Hadoop: Add a New DataNode

DataNode:
Use rsync to copy the Hadoop installation from one of the DataNodes you previously set up. Be sure to change any DataNode-specific settings you configured during installation.
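
The copy step could be sketched roughly as follows. This is only an illustration: the hostname `datanode1`, the `run` and `sync_from_datanode` helpers, and the choice to skip the `logs/` directory are assumptions, and the install path is taken from the NameNode step in this post. The helper only prints the command unless `DRY_RUN=0` is set.

```shell
# Hypothetical helper: print the command when DRY_RUN=1 (the default here),
# otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Copy the Hadoop install from an existing DataNode onto this machine.
# $1 = source hostname (placeholder), $2 = install directory.
# Assumption: the source node's logs/ directory is not wanted on the new node.
sync_from_datanode() {
  run rsync -a --exclude '/logs/' "$1:$2/" "$2/"
}

# Prints the rsync command; rerun with DRY_RUN=0 to perform the copy.
sync_from_datanode datanode1 /usr/local/hadoop
```

Review the printed command first, then rerun with `DRY_RUN=0` once the source host and path look right.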

  1. hadoop-daemon.sh start datanode
  2. start-yarn.sh

NameNode:

  1. nano /usr/local/hadoop/etc/hadoop/slaves

Add the hostname of the new slave on its own line.
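
If you would rather script this than edit the file by hand, a minimal sketch is shown here. The `add_slave` function and the hostname `datanode4` are hypothetical; the function appends the hostname only when it is not already listed, and it assumes the file ends with a newline.

```shell
# Append a hostname to the slaves file only if it is not already listed.
# $1 = hostname to add, $2 = path to the slaves file.
add_slave() {
  grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Example (hostname is a placeholder):
# add_slave datanode4 /usr/local/hadoop/etc/hadoop/slaves
```

Running it twice with the same hostname leaves a single entry, so it is safe to repeat.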

  1. hdfs dfsadmin -refreshNodes

This has the NameNode re-read its host include and exclude files, so the node list is refreshed without a full restart.
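
To confirm the new node has registered, one option is to look for it in the DataNode report. The sketch below assumes each entry in the report contains a line of the form `Hostname: <name>`; the `datanode_reported` function and the hostname `datanode4` are hypothetical.

```shell
# Succeed if the given hostname appears in a DataNode report read from stdin.
# Assumption: each DataNode entry has a line exactly matching "Hostname: <name>".
datanode_reported() {
  grep -qxF -- "Hostname: $1"
}

# Example (hostname is a placeholder):
# hdfs dfsadmin -report -live | datanode_reported datanode4 && echo "datanode4 is live"
```

If the check fails, the DataNode log on the new machine is usually the first place to look.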

A newly added DataNode holds no data, so you can rebalance the cluster to whatever threshold makes sense in your environment.

  1. hdfs balancer -threshold 1 -include ALL_DATA_NODES_HOSTNAME_SEPARATED_BY_COMMA
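
The comma-separated host list does not have to be typed by hand. As a rough sketch, assuming the slaves file lists every DataNode with one hostname per line, a hypothetical `slaves_csv` helper can build it:

```shell
# Print the hostnames in a slaves file as one comma-separated line,
# skipping blank lines and comment lines.
# $1 = path to the slaves file.
slaves_csv() {
  grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$1" | paste -s -d, -
}

# Example:
# hdfs balancer -threshold 1 -include "$(slaves_csv /usr/local/hadoop/etc/hadoop/slaves)"
```

A threshold of 1 asks the balancer to bring each node's disk usage within 1 percentage point of the cluster average, which can take a long time on a large cluster; a looser value finishes sooner.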