Two years ago, I published a Maven archetype for Hadoop that turned out to be quite popular, judging from the comments I received and the access logs on my server. Today I've updated it to use the latest version of Hadoop and to build on Maven 3 without warnings.
In my last article, I showed how to build a Hadoop job that contains all its dependencies. To make things even easier, I created a Maven archetype that turns project setup into a simple 30-second process.
To generate a new project, run the following command (on one line):
Non-trivial Hadoop jobs usually have dependencies that go beyond those provided by the Hadoop runtime environment. That means that if your job needs additional libraries, you have to make sure they are on Hadoop's classpath when the job is executed. This article shows how you can build a self-contained …