If you want your multi-node cluster to be rack aware, you need to do a few things. The following steps are done on the master (namenode) only.
- nano /home/myuser/rack.sh
Add the following contents:
- #!/bin/bash
- # Adjust/Add the property "net.topology.script.file.name"
- # to core-site.xml with the "absolute" path to this
- # file. ENSURE the file is "executable".
- # Supply appropriate rack prefix
- RACK_PREFIX=myrackprefix
- # To test, supply a hostname as script input:
- if [ $# -gt 0 ]; then
- CTL_FILE=${CTL_FILE:-"rack.data"}
- HADOOP_CONF=${HADOOP_CONF:-"/home/myuser"}
- if [ ! -f ${HADOOP_CONF}/${CTL_FILE} ]; then
- echo -n "/$RACK_PREFIX/rack "
- exit 0
- fi
- while [ $# -gt 0 ] ; do
- nodeArg=$1
- exec< ${HADOOP_CONF}/${CTL_FILE}
- result=""
- while read line ; do
- ar=( $line )
- if [ "${ar[0]}" = "$nodeArg" ] ; then
- result="${ar[1]}"
- fi
- done
- shift
- if [ -z "$result" ] ; then
- echo -n "/$RACK_PREFIX/rack "
- else
- echo -n "/$RACK_PREFIX/rack_$result "
- fi
- done
- else
- echo -n "/$RACK_PREFIX/rack "
- fi
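To see how the lookup behaves before pointing Hadoop at the script, the host-to-rack mapping can be tried in isolation. The following is a minimal sketch rather than the script itself: it mirrors the same logic in a function (the name lookup_rack is hypothetical) against a temporary data file, and the IP addresses are placeholders.

```bash
#!/bin/bash
# Sketch only: mirrors the lookup in rack.sh against a throwaway data file.
# The IP addresses are placeholders; lookup_rack is a hypothetical helper name.
RACK_PREFIX=myrackprefix
data_file=$(mktemp)
trap 'rm -f "$data_file"' EXIT

# Two sample entries in the same "host rack" format the script expects.
printf '%s\n' "10.0.0.1 1" "10.0.0.2 2" > "$data_file"

# Print the rack path for one host; unknown hosts get the default rack.
lookup_rack() {
  local host=$1 name rack result=""
  while read -r name rack _; do
    if [ "$name" = "$host" ]; then
      result=$rack
    fi
  done < "$data_file"
  if [ -z "$result" ]; then
    echo "/$RACK_PREFIX/rack"
  else
    echo "/$RACK_PREFIX/rack_$result"
  fi
}

lookup_rack 10.0.0.1   # prints /myrackprefix/rack_1
lookup_rack 10.0.0.9   # prints /myrackprefix/rack (host not listed)
```

A host that is listed resolves to its numbered rack; anything else falls back to the same default rack that rack.sh prints.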
Set execute permissions on the script
- sudo chmod 755 /home/myuser/rack.sh
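Create the data file that holds your rack information. With the defaults in rack.sh (HADOOP_CONF and CTL_FILE), the script reads it from /home/myuser/rack.data. Each line holds a host followed by its rack number. Be careful not to put extra spaces between the host and the rack.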
- namenode_ip 1
- secondarynode_ip 2
- datanode1_ip 1
- datanode2_ip 2
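Because the script matches on the first field of each line and takes the second as the rack, a quick format check on the data file can catch mistakes early. This is a minimal sketch, not part of the original setup: the function name check_rack_data is hypothetical, and it assumes every line should hold exactly two whitespace-separated fields.

```bash
#!/bin/bash
# Sketch only: report lines of a rack data file that are not exactly
# two whitespace-separated fields (host, rack). Blank lines are reported too.
check_rack_data() {
  local file=$1 lineno=0 bad=0 host rack extra
  # The "|| [ -n "$host" ]" part also handles a last line without a newline.
  while read -r host rack extra || [ -n "$host" ]; do
    lineno=$((lineno + 1))
    if [ -z "$host" ] || [ -z "$rack" ] || [ -n "$extra" ]; then
      echo "line $lineno: expected 'host rack'" >&2
      bad=1
    fi
  done < "$file"
  return "$bad"
}

# Example (path taken from the defaults in rack.sh):
#   check_rack_data /home/myuser/rack.data && echo "rack.data looks fine"
```

The function returns 0 when every line is well formed and 1 otherwise, so it can be chained with other commands.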
The last step is to update the core-site.xml file located in your hadoop directory.
- nano /usr/local/hadoop/etc/hadoop/core-site.xml
Add the following property, with the value set to the absolute path of your rack.sh file.
- <property>
- <name>net.topology.script.file.name</name>
- <value>/home/myuser/rack.sh</value>
- </property>
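The comments in rack.sh stress two requirements for the value above: the path must be absolute and the file must be executable. Both can be checked with a small helper. This is a sketch under those assumptions only; the function name check_topology_script is hypothetical.

```bash
#!/bin/bash
# Sketch only: confirm that a topology script path is absolute and executable.
check_topology_script() {
  local script=$1
  case $script in
    /*) ;;  # absolute path, keep going
    *) echo "not an absolute path: $script" >&2; return 1 ;;
  esac
  if [ ! -x "$script" ]; then
    echo "not executable: $script" >&2
    return 1
  fi
  echo "ok: $script"
}

# Example (path taken from the property value above):
#   check_topology_script /home/myuser/rack.sh
```

Once the namenode has been restarted so that it picks up the new property, running hdfs dfsadmin -printTopology should list each node under its rack.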