Hadoop Upgrade from 0.20 Caltech to 0.20 OSG

These notes come from our experience upgrading our Tier3 from 1.2.x. They are based on the OSG instructions (see References) and adapted to our topology (Rocks 5.4). For up-to-date instructions, check the OSG instructions first.

References

https://www.opensciencegrid.org/bin/view/Documentation/Release3/HadoopOverview

https://www.opensciencegrid.org/bin/view/Documentation/Release3/UpgradeHadoop

Requirements before the upgrade

Make sure to remove the repository for the Caltech hadoop-0.20 packages.

yum remove osg-hadoop

Install the required yum repositories

For datanodes, a local mirror is faster.

wget http://download.fedoraproject.org/pub/epel/5/i386/epel-release-5-4.noarch.rpm
rpm -i epel-release-5-4.noarch.rpm 
yum install yum-priorities
rpm -Uvh http://repo.grid.iu.edu/osg-release-latest.rpm
#or use /etc/yum.repos.d/osg.repo
#with osg.repo pointing to a local mirror
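
As a rough sketch of the local-mirror option mentioned in the comments above, the helper below writes a minimal osg.repo file. The function name write_osg_repo, the mirror URL, and the priority value are assumptions for illustration and are not taken from the OSG instructions. The gpgcheck and gpgkey settings are left out; copy them from the osg.repo installed by the osg-release package.

```shell
# Hypothetical helper: write a minimal yum repo file pointing at a local mirror.
# $1 = base URL of the mirror (assumed layout), $2 = output file.
write_osg_repo() {
    mirror_url="$1"
    out_file="$2"
    cat > "$out_file" <<EOF
[osg]
name=OSG Software (local mirror)
baseurl=$mirror_url
enabled=1
priority=98
EOF
}

# Example (the URL is a placeholder, not a real mirror):
# write_osg_repo "http://mirror.example.edu/osg" /etc/yum.repos.d/osg.repo
```

The priority line only takes effect when yum-priorities is installed, which the steps above already do.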

#unmount fuse
umount /mnt/hadoop
#stop hadoop
/etc/init.d/hadoop stop

Upgrade Hadoop

Install Hadoop

yum update hadoop-0.20-osg
#recreate broken link
ln -sf /etc/alternatives/hadoop-log /var/log/hadoop
#optional: use a local configuration directory (/etc/hadoop-uprm in our case)
#rm -f /etc/alternatives/hadoop-etc
#ln -s /etc/hadoop-uprm  /etc/alternatives/hadoop-etc

#start hadoop
service hadoop start
#mount fuse
mount /mnt/hadoop
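
After the final mount it can help to confirm that the fuse mount is really there. The following is a minimal sketch, assuming a Linux node where /proc/mounts is available. The helper name is_mounted is hypothetical and is not part of Hadoop or the OSG packages.

```shell
# Hypothetical helper: succeed if $1 appears as a mount point in the mounts table.
# $2 is the mounts file to read; it defaults to /proc/mounts.
is_mounted() {
    mount_point="$1"
    mounts_file="${2:-/proc/mounts}"
    # Field 2 of each line in /proc/mounts is the mount point.
    awk -v mp="$mount_point" '$2 == mp { found = 1 } END { exit !found }' "$mounts_file"
}

if is_mounted /mnt/hadoop; then
    echo "/mnt/hadoop is mounted"
else
    echo "/mnt/hadoop is NOT mounted" >&2
fi
```

The same check can be run before the upgrade to make sure the unmount step succeeded.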