Debian package of Hadoop
Hi,
Today I tried to run the Cloudera Debian distribution on a 4-machine cluster.
I still have some issues; see my list below. Some of them may require a fix in
the packaging.
Therefore I thought that it may be time to start an official Debian package of
Hadoop with a public Git repository so that everybody can participate.
Would Cloudera support this? I'd package Hadoop 0.20 and apply all the
Cloudera patches (managed with topgit[1]).
At this point I'd also like your opinion on whether the Debian package should
use versioned binary package names like hadoop-18 and hadoop-20, or just plain
hadoop.
My issues so far:
start-dfs.sh started only the local namenode, and neither the secondary NN nor
the datanodes, without indicating any error. The masters and slaves files were
configured correctly.
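To rule out the simplest cause, something like the following sketch could be
used. It rests on the assumption that start-dfs.sh reaches the hosts listed in
masters and slaves over SSH, so an unreachable SSH port on those hosts would be
one possible reason for daemons not coming up. The function names and the idea
of passing the file paths on the command line are mine, not part of the
package; the script only tests whether TCP port 22 accepts connections, not
whether a passwordless login works.

```python
import socket
import sys


def read_hosts(text):
    """Return host names from a masters/slaves style file.

    Everything after '#' is treated as a comment and blank lines are skipped.
    """
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            hosts.append(line)
    return hosts


def ssh_port_open(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Hypothetical usage: pass the paths of the masters and slaves files.
    for path in sys.argv[1:]:
        with open(path) as fh:
            for host in read_hosts(fh.read()):
                state = "reachable" if ssh_port_open(host) else "NOT reachable"
                print(f"{path}: {host}: SSH port {state}")
```

Run it with the paths of the two files as arguments; where they live depends
on the packaging.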
When I started the datanodes manually they were recognized and HDFS worked.
However, the web UIs at port 50075 show only a directory view with a WEB-INF/
directory in it. This is most likely a packaging or configuration issue.
The same happens with the SNN on port 50090.
The SNN shows:
java.io.FileNotFoundException:
http://192.168.122.166:50070/getimage?putimage=1&port=50090&machine=127.0.1.1&token=-18:737152035:0:1262195990000:1262194649873
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1288
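Picking the failing URL apart may help here. The sketch below only parses the
URL from the exception with the Python standard library; the function name is
made up for illustration. It shows that the machine parameter is a loopback
address. My guess, and it is only a guess, is that the SNN resolved its own
hostname via the 127.0.1.1 entry that Debian puts in /etc/hosts by default, so
the namenode may be asked to fetch the image from an address that points back
to itself.

```python
import ipaddress
from urllib.parse import parse_qs, urlsplit

# URL copied verbatim from the FileNotFoundException above.
URL = (
    "http://192.168.122.166:50070/getimage?putimage=1&port=50090"
    "&machine=127.0.1.1&token=-18:737152035:0:1262195990000:1262194649873"
)


def describe_getimage_url(url):
    """Split the getimage URL into the parts relevant for debugging."""
    parts = urlsplit(url)
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    machine = ipaddress.ip_address(params["machine"])
    return {
        "namenode_host": parts.hostname,
        "namenode_port": parts.port,
        "machine": str(machine),
        "machine_port": int(params["port"]),
        "machine_is_loopback": machine.is_loopback,
    }


if __name__ == "__main__":
    for key, value in describe_getimage_url(URL).items():
        print(f"{key}: {value}")
```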
I have not got MapReduce working yet. It seems that it would be very helpful
to have some more verbose example configuration files. These should have
commented properties for the most important settings.
[1] http://lists.alioth.debian.org/pipermail/vcs-pkg-discuss/2009-December/000688.html
Have a nice New Year's Eve,
Thomas Koch, http://www.koch.ro