Sunday, December 25, 2011

Adventures in Ubuntu and Hadoop Part 1

One of my goals for 2012 is to learn Hadoop, so I want to install it somewhere. I have a nice Windows 7 laptop that I use for most development. Hadoop only "sort of runs" on Windows, for basic development only, and then only with Cygwin. Now, I know many people who love and swear by Cygwin, but I'm in the other camp: in my admittedly limited experience, I disliked and swore at Cygwin. Besides, I've wanted to play with Linux for a while anyway. Twenty years ago I knew how to use vi and ls. How hard could it be? And reorganizing and swapping offices with my wife freed up an old 160 GB disk and some RAM to upgrade my 7-year-old desktop, a Gateway 503GR. So, why not install Linux on the 2nd drive, and then Hadoop?

The nice thing about Linux is that you can spend weeks deciding which "distro" is best. Fortunately, some Googling revealed that Ubuntu definitely ran on my old machine, and that Ubuntu got good reviews as relatively "easy". I grabbed a book from the local library (Ubuntu Linux by William von Hagen) and got started.

Downloading the ISO image for Ubuntu 11.10 to my Windows machine was pretty easy. Fortunately I already had a utility program to burn ISO images to CD. So far, so good. Transferring the hardware was easy too. My 1 GB of RAM and single 160 GB disk were quickly upgraded to 2 GB and 2 x 160 GB. My plan was to keep Windows XP on the original disk and install Linux on the 2nd one.

O.K., boot up and hit F2. Well, on the first try my timing was off, but I did notice that Windows saw the new RAM and disk. The second time I got the timing right and opened in "Try Ubuntu" mode. I wanted to look at the disks to make sure which was which. It turned out that disk "b" was the new one. Then I clicked Install. I got a surprising dialog about "unmounting" the disks, thought a bit, said what the hey and did that. I told it to install on disk "b" and everything went pretty smoothly. Except, I noticed that the "look for updates" box was disabled. My network adapter (on the motherboard) is getting cranky, especially after the computer is totally disconnected from AC power. Anyway, it installed fine.

Time to reboot. To my pleasant surprise, the "Grand Unified Boot Loader" (GRUB) worked just as advertised, and I could boot into either Linux or Windows. Ubuntu recognized my NVidia card and I was able to get both monitors configured quickly. Just needed some playing with the power cable to trick the network adapter into working.

So far, so good. Now, to install Java. Right now Java 6 is very well established, at update 30, while Java 7 is pretty new. I'm not using any of the new Java 7 features yet. Java SE 6 update 30 it is. Now the trouble begins. I'm sure that somebody knows why (here's a link), but Ubuntu and the "official" Oracle Java JDK don't get along, in that Ubuntu doesn't feature it in their Software Center or Synaptic Package Manager. They do feature the OpenJDK stuff, which has some mixed reviews. Besides, that would be too easy - what would I learn with that? I downloaded the official .bin package, not the rpm.bin. I moved it deep into the bowels of /opt, then ran it (eventually) via

sudo ./jdk-6u30-linux-i586.bin

Geez, you need to type sudo a lot! But java -version didn't work. And my feeble attempts to add java/bin to the PATH didn't seem to work either.
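For reference, here is a minimal sketch of what a PATH-based setup might look like. It assumes the JDK ended up in /opt/java/32/jdk1.6.0_30 (the path that shows up in the update-alternatives command further down); adjust it to wherever yours actually landed. Note that an export typed at the prompt only lasts for that shell session, so to make it stick the lines would have to go in a startup file such as ~/.profile.

```shell
# Assumed install location, taken from the update-alternatives command in this post.
JAVA_HOME=/opt/java/32/jdk1.6.0_30
export JAVA_HOME

# Put the JDK's bin directory at the front so its java wins over any other.
PATH="$JAVA_HOME/bin:$PATH"
export PATH

# Report where the shell now finds java, if anywhere.
command -v java || echo "java not found on PATH"
```

If the last line prints "java not found on PATH", the directory in JAVA_HOME is probably not the one the installer actually created.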

The magical incantation, found by Googling, was:

sudo update-alternatives --install /usr/bin/java java /opt/java/32/jdk1.6.0_30/bin/java 100
sudo update-alternatives --config java

The first command will put a java link in /usr/bin that points to /etc/alternatives/java, which is a link that points to the actual stuff in /opt. Anyway, this worked, and java -version happily printed out java version "1.6.0_30"
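To see how that two-step link chain behaves without touching the real /usr/bin or /etc, here is a small sketch that rebuilds the same layout in a scratch directory. The directory names under the scratch area and the "fake java" stub are made up for illustration; only the shape of the chain mirrors what update-alternatives sets up.

```shell
# Build the chain in a throwaway directory (hypothetical layout).
demo=$(mktemp -d)
mkdir -p "$demo/opt/jdk/bin" "$demo/etc/alternatives" "$demo/usr/bin"

# Stand-in for the real JDK binary.
printf '#!/bin/sh\necho "fake java"\n' > "$demo/opt/jdk/bin/java"
chmod +x "$demo/opt/jdk/bin/java"

# etc/alternatives/java -> the actual binary
ln -s "$demo/opt/jdk/bin/java" "$demo/etc/alternatives/java"
# usr/bin/java -> etc/alternatives/java
ln -s "$demo/etc/alternatives/java" "$demo/usr/bin/java"

# readlink -f follows every link in the chain to the final target.
resolved=$(readlink -f "$demo/usr/bin/java")
echo "$resolved"

# Running the outermost link runs the stub at the end of the chain.
"$demo/usr/bin/java"
```

On the real machine, readlink -f /usr/bin/java should likewise print the path of the java binary under /opt.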

BTW, I since found a more complete web site with the gory details. Looks like I still have a little work to do.

Before moving on to Hadoop (which will be the subject of a future post) I thought I should install Eclipse. I downloaded the latest, eclipse-java-indigo-SR1-linux-gtk.tar.gz, put it into /opt/eclipse, and then typed my first tar xzf in years.

tar xzf ec*

Drat, need yet another sudo,

sudo tar xzf ec*

Worked! A little more research found a simple script to put into /usr/bin to launch it. Create /usr/bin/eclipse with:

#!/bin/sh
export ECLIPSE_HOME="/opt/eclipse/eclipse"

# "$@" passes each argument through intact, even if it contains spaces
"$ECLIPSE_HOME/eclipse" "$@"

Voilà! I'll probably want to add some settings for memory usage, but, for now, Eclipse started up and asked me to select a workspace, etc. It is pointing at the correct JRE. And a quick "Hello World" worked.
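As for those memory settings, one possible approach is to pass JVM options through the launcher's -vmargs switch. The sketch below wraps that in a function; the function name launch_eclipse and the heap sizes are my own placeholders, not recommendations, and would need tuning for a 2 GB machine.

```shell
#!/bin/sh
# Default to the install location used in this post; can be overridden from the environment.
ECLIPSE_HOME="${ECLIPSE_HOME:-/opt/eclipse/eclipse}"

# Hypothetical helper: start Eclipse with the caller's arguments first,
# then -vmargs, after which everything is handed to the JVM.
launch_eclipse() {
    # Heap sizes are illustrative guesses only.
    "$ECLIPSE_HOME/eclipse" "$@" -vmargs -Xms256m -Xmx512m
}
```

To use it as the /usr/bin/eclipse script, the file would end with a line calling launch_eclipse "$@". The -vmargs switch has to come last, since everything after it is treated as a JVM argument.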

Stay tuned for the next exciting episode, "Ricky and Bullwinkle wrestle with Hadoop", or, "How many more times will I forget the sudo wrestler?". :-)


  1. This post is really too informative to us, good perception of images and good description by which any one can get information what they want to this post.......As for as my thinking is concerned this one is the best post.

    Thanks for sharing such a informative post.
    Power Cables

  2. Glad you liked it. Yes, it is really more of a "dump" of a lot of colorful information and issues than a step-by-step instruction manual. Hopefully the links will help with the gory details.