DataNode:
Use rsync to copy the Hadoop installation from one of the datanodes you previously set up. Make sure you change any datanode-specific settings you configured during installation.
- hadoop-daemon.sh start datanode
- start-yarn.sh
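The rsync copy described above could look roughly like the sketch below. The function name copy_hadoop_from, the hostname datanode1, and the default path /usr/local/hadoop (taken from the slaves path used on the NameNode) are assumptions for illustration; adjust them to your own layout.

```shell
#!/usr/bin/env bash
# Sketch: pull the Hadoop directory from an existing datanode onto this new one.
# copy_hadoop_from and the default path are illustrative assumptions.
copy_hadoop_from() {
  local src_host="$1"
  local hadoop_home="${2:-/usr/local/hadoop}"
  # -a keeps permissions and timestamps, -z compresses data in transit.
  # The trailing slashes copy the directory contents instead of nesting it.
  rsync -az "${src_host}:${hadoop_home}/" "${hadoop_home}/"
}

# Example (datanode1 is a placeholder hostname):
# copy_hadoop_from datanode1
```

Run it on the new datanode before starting any daemons, then review the copied configuration for settings that belong to the source host.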
NameNode:
- nano /usr/local/hadoop/etc/hadoop/slaves
Add the new slave's hostname on its own line.
- hdfs dfsadmin -refreshNodes
This refreshes the NameNode's list of nodes without a full restart.
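If you would rather script the slaves-file edit than open an editor, something like the following sketch may work. The helper name add_slave and the hostname datanode4 are hypothetical; the default path is the one used above. It assumes the file already ends with a newline.

```shell
#!/usr/bin/env bash
# Sketch: append a hostname to the slaves file only if it is not already listed.
# add_slave is an illustrative helper, not part of Hadoop.
add_slave() {
  local host="$1"
  local slaves_file="${2:-/usr/local/hadoop/etc/hadoop/slaves}"
  # -x matches the whole line, -F treats the hostname as a literal string.
  grep -qxF -- "$host" "$slaves_file" 2>/dev/null \
    || printf '%s\n' "$host" >> "$slaves_file"
}

# Example (datanode4 is a placeholder hostname):
# add_slave datanode4 && hdfs dfsadmin -refreshNodes
```

Because the check is for an exact line match, running it twice for the same host leaves the file unchanged.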
A newly added datanode starts out with no data on it, so rebalance the cluster to whatever makes sense in your environment.
- hdfs balancer -threshold 1 -include ALL_DATA_NODE_HOSTNAMES_SEPARATED_BY_COMMA
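Typing every hostname by hand gets tedious on larger clusters. As a rough sketch, the comma-separated list could be built from the slaves file, assuming that file lists exactly the datanodes you want to include. The function name slaves_to_csv is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: join the hostnames in a slaves file into one comma-separated string.
# Blank lines and lines starting with '#' are skipped.
# slaves_to_csv is an illustrative helper, not part of Hadoop.
slaves_to_csv() {
  local slaves_file="$1"
  grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$slaves_file" | paste -sd, -
}

# Example, using the slaves path from the NameNode step:
# hdfs balancer -threshold 1 \
#   -include "$(slaves_to_csv /usr/local/hadoop/etc/hadoop/slaves)"
```

A threshold of 1 asks the balancer to bring each datanode within 1 percentage point of the cluster's average utilization, which can take a long time on a busy cluster; a larger value finishes sooner.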