31 January 2009

Grid Computing with GridGain

Similar to the famous "one word - Plastics" scene in "The Graduate", a senior engineer took me aside on the weekend of the Columbia disaster and said "Grid Computing". We were working long hours on a proposal that was insane at best. At the time, I had heard of SETI@home but had never really spent any time on the subject. Tim was pretty excited about the topic. He had just read an article in some Hype-of-the-Day rag and was sure that Grid Computing was going to be the solution we needed for our proposal.

We ended up winning the job, which is a really big deal. I think we're up to around $400M now over the past few years. The app is a distributed system (not a web app) but doesn't use Grid Computing and never will. It's tied into the customer's crazy network dreams and their infrastructure framework that locks our applications to theirs. Actually, it's brilliant in a business way. Our apps can't ever be used for anything other than their original purposes.

For the past few months, I've been following the blog @Gridify Cloud Computing via Google Reader. Honestly, I don't know why. I must have stumbled on an interesting entry one day and added it to the reader for future reference. This weekend, I was scanning the blogs and I saw a GridGain entry that started with:
When executing tasks and jobs on the grid you may be faced with the question: "How do I make sure that tasks from other users are not executed on nodes started by me?"
We're asking ourselves the same question on my project, so the entry piqued my interest. The GridGain solution can't help me in any way, but it got me to their website, where I started to poke around. I watched a few short videos on GridGain installation and creating an app in 15 minutes. Ok, I'm interested, let's do this.

Installation is pretty easy: just go to the downloads page and grab the installer for your OS. I'm using Linux, so I got the GUI installer gridgain-unix-2.1.0.sh and ran it like this:
sh ./gridgain-unix-2.1.0.sh
Pretty easy. I decided to install under my projects folder, so next I set up the GRIDGAIN_HOME environment variable in ~/.bashrc like this:
export GRIDGAIN_HOME=~/projects/gridgain-2.1.0
I haven't used Java much lately and was happy to see that GridGain had Groovy examples. I attempted to run the examples and immediately ran into difficulties. The first was my fault. I had un-installed Groovy a while ago and needed to get the latest distribution. I did:
sudo apt-get install groovy
And presto, Groovy was re-installed. A quick sanity check verified that it was installed:
groovy -v
This reveals that groovy 1.5.2 was installed. Hmmm, I swore that a newer version was out but 1.5.2 should be good for the GridGain demos, right? Right.

I stepped up to the first example compileGridify.sh. It immediately complained that GROOVY_HOME wasn't set. Back to ~/.bashrc and I added this line:
export GROOVY_HOME=/usr/share/groovy
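Since the example scripts lean on both GRIDGAIN_HOME and GROOVY_HOME, a quick check that each variable is set and points at a real directory can save a round trip. This is only a sketch of my own; check_home is a made-up helper name and is not part of GridGain or Groovy.

```shell
# Check that an environment variable names an existing directory.
# check_home is a hypothetical helper, not something GridGain ships.
# Usage: check_home <VARIABLE_NAME>
check_home() {
    name=$1
    # Indirect lookup: read the value of the variable whose name is in $name
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
        echo "$name is not set" >&2
        return 1
    fi
    if [ ! -d "$value" ]; then
        echo "$name points to a missing directory: $value" >&2
        return 1
    fi
    echo "$name=$value"
}
```

After reloading ~/.bashrc, running check_home GRIDGAIN_HOME and check_home GROOVY_HOME should each echo the variable back, or say which one is wrong.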
Then the script started to complain that class Gridify couldn't be found. This is GridGain's own example; the thing should run almost out-of-the-box. Durn it! A little more digging revealed that the compile example referenced gridgain.jar in the classpath when the actual jar is called gridgain-2.1.0.jar. I started to edit the file and realized that there were probably other scripts in the GridGain install that may have the same error. Rather than go on an edit-fest, I simply created a soft link in GRIDGAIN_HOME like this:
ln -s gridgain-2.1.0.jar gridgain.jar
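If you want the same trick in a form that is safe to re-run (say, from a setup script), something like the following might do. It is a sketch under my own assumptions: link_jar_alias is a hypothetical helper name, and it only wraps the plain ln -s shown above with two checks.

```shell
# Create an unversioned alias for a versioned jar, but only if it is missing.
# link_jar_alias is a hypothetical helper, not part of the GridGain install.
# Usage: link_jar_alias <dir> <versioned-jar> <alias>
link_jar_alias() {
    dir=$1; versioned=$2; alias_name=$3
    # Refuse to create a dangling link if the real jar is not there
    if [ ! -f "$dir/$versioned" ]; then
        echo "missing $dir/$versioned" >&2
        return 1
    fi
    # -e follows symlinks, so an existing valid link (or real file) is left alone
    if [ ! -e "$dir/$alias_name" ]; then
        (cd "$dir" && ln -s "$versioned" "$alias_name")
    fi
}
```

Called as link_jar_alias "$GRIDGAIN_HOME" gridgain-2.1.0.jar gridgain.jar, it creates the same relative link as the one-liner and does nothing on a second run.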
Yeah! It compiled. Now to the run script. The runGridify.sh script complained that it couldn't find the groovy.lang.GroovyObject class. Kinda weird cause that's the momma Groovy class. Sounds like another classpath issue. And it was. The run script references embeddable/groovy-all-1.5.7.jar. Problem is that not only did I have version 1.5.2, that version doesn't even include a groovy-all jar. I wonder why? I then went to the Groovy home page and looked around. Version 1.5.6 has a Debian package but the newest 1.5.7 doesn't. Sounds like 1.5.6 is the next to try. After installation, I looked at GROOVY_HOME and yes!! It has an embeddable folder and a groovy-all-1.5.6.jar. Now a quick edit to the runGridify.sh file and back in business.
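I made that edit by hand, but it could also be scripted. The sketch below assumes the run script mentions the jar by its literal file name (groovy-all-<version>.jar), which is what I saw in runGridify.sh; patch_groovy_all is a made-up helper name, and the rest of the script's contents are not assumed.

```shell
# Point a run script at whichever groovy-all jar is actually installed.
# patch_groovy_all is a hypothetical helper; it assumes the script names
# the jar literally as groovy-all-<version>.jar.
# Usage: patch_groovy_all <script> <groovy-home>
patch_groovy_all() {
    script=$1; groovy_home=$2
    # Pick the first groovy-all jar in the embeddable folder, if any
    jar=$(ls "$groovy_home"/embeddable/groovy-all-*.jar 2>/dev/null | head -n 1)
    if [ -z "$jar" ]; then
        echo "no groovy-all jar under $groovy_home/embeddable" >&2
        return 1
    fi
    jar_name=$(basename "$jar")
    # Keep a .bak copy of the original script, then swap the jar file name
    sed -i.bak "s/groovy-all-[0-9][0-9.]*\.jar/$jar_name/g" "$script"
}
```

With Groovy 1.5.6 installed, patch_groovy_all runGridify.sh "$GROOVY_HOME" would rewrite the 1.5.7 reference to groovy-all-1.5.6.jar and leave the original in runGridify.sh.bak.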

The Gridify example starts up a default configuration and prints out "Hello World" on a remote node. Nothing too exciting, but impressive if you look at all that is really happening in the background.

The scripts for the other example, compileHelloWorld.sh and runHelloWorld.sh, have the exact same issues with classpath jars. This example is similar to Gridify but prints "Hello" on one node and "World" on another. To do this, start another node in a terminal before running runHelloWorld.sh:
./gridgain.sh
There are many cool things that can be done with GridGain. One interesting thing is running JUnit tests in parallel after a build. The GridGain guy claims that running their tests on a single box takes about 3 hours. Gridifying the Ant task reduces that to about 20 minutes.

Take a look at the videos and other stuff on the website to get started. DZone has a pile of hits as well; just search DZone for gridgain.

I reported these version probs to the GridGain user forum. They made the fix and updated the wiki the next day. Nice.

http://www.gridgainsystems.com/wiki/display/GG15UG/Groovy+Configuration+And+Setup
