Linux in Big Data projects
Hey guys, we are interested in learning from your experience using Linux in Big Data projects. Has anyone used Hadoop, MapR, or Hortonworks on Linux, and what has your experience been? I am most interested in knowing whether a particular Linux distribution is better supported for Hadoop, and why. I would also like to know if anyone is using Gluster and, if so, whether there are any alternatives similar to it.
We've been using Hadoop for about 2 years now at my workplace (mostly on Red Hat Enterprise Linux 5 boxes, if it matters). Since the software is written in Java, it is pretty portable to other distros as well.
MapR is actually a company that adds some nice stuff on top of standard Hadoop (e.g. better management tools, the MapR filesystem, snapshots, etc.). We were in discussions with them about using some of their stuff on our Hadoop cluster, but nothing has been signed so far.