Add Nodes to a Pseudo Hadoop Cluster
In this blog, I assume a single-node Hadoop cluster has already been set up and we want to add more slave nodes to it. For more information about setting up a single-node cluster, please refer to “Access to Hive from HANA – Section 1 Hadoop Installation“.
Setting up a Hadoop slave is similar to setting up the pseudo-distributed Hadoop node we already have, so please follow “Access to Hive from HANA – Section 1 Hadoop Installation“ to set up Hadoop on your slave machines. Then we need to adjust the configuration, both in Hadoop and in the hostnames, to establish the connection between the master node and the slave nodes.
On the master node,
edit the IPv4 addresses in /etc/hosts as follows:
127.0.0.1 localhost
xxx.xxx.xxx.xxx Full-domain-name master
xxx.xxx.xxx.xxx Full-domain-name slave1
xxx.xxx.xxx.xxx Full-domain-name slave2
…….
List every slave you have, each with its IP address, fully qualified domain name, and short name. You can find a machine's IP address by running ifconfig as root.
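To double-check the entries, a quick sanity test like the one below helps; the host name slave1 is just a placeholder for whatever you put in /etc/hosts:
ip addr show          # or ifconfig as root, to confirm the machine's IPv4 address
ping -c 3 slave1      # the short names should resolve from the master (and vice versa)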
Still on the master node, edit the core-site.xml and yarn-site.xml files to tell Hadoop where the master node is.
In core-site.xml, modify the fs.default.name property as follows:
<property>
<name>fs.default.name</name>
<value>hdfs://master:8020</value>
</property>
In yarn-site.xml, modify the yarn.resourcemanager.hostname property as follows:
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
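After saving both files, a quick check on the master confirms the values were picked up; hdfs getconf reads the core-site.xml we just edited, so it should echo the new address:
hdfs getconf -confKey fs.defaultFS
grep -A1 'yarn.resourcemanager.hostname' $HADOOP_HOME/etc/hadoop/yarn-site.xml
The first command should print hdfs://master:8020, and the grep should show master as the value.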
Then edit the slaves file in the same directory ($HADOOP_HOME/etc/hadoop):
master
slave1
slave2
….
List the short names of all the slave nodes as you defined them in /etc/hosts. (NOTICE: the master node can act as both master and slave in Hadoop, which is why it appears in the list.)
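As a side note, once the slaves file is filled in, the helper scripts under $HADOOP_HOME/sbin can start the whole cluster from the master in one go; this assumes passwordless SSH from the master to every listed host is already configured:
$HADOOP_HOME/sbin/start-dfs.sh     # starts the NameNode here and a DataNode on every host listed in slaves
$HADOOP_HOME/sbin/start-yarn.sh    # starts the ResourceManager here and a NodeManager on every slave
We will start the daemons on each slave manually later in this blog, but both approaches work.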
On all the slave nodes,
copy the configuration files from the master node:
scp user@master:$HADOOP_HOME/etc/hadoop/<file> $HADOOP_HOME/etc/hadoop/<file>
(NOTICE: For a test setup we only need to copy core-site.xml, hdfs-site.xml, and yarn-site.xml from the master to the slaves. The aim of copying the configuration files is to let the slaves know who their boss is!)
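If you only need those three files, a small loop run on each slave keeps them in sync; the user name hduser is a placeholder, and Hadoop is assumed to be installed under the same $HADOOP_HOME path on every node:
for f in core-site.xml hdfs-site.xml yarn-site.xml; do
  scp hduser@master:$HADOOP_HOME/etc/hadoop/$f $HADOOP_HOME/etc/hadoop/$f
done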
You may want to modify dfs.datanode.data.dir in hdfs-site.xml to define where to store data on each slave.
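For example, to keep block data under a dedicated directory on a slave (the path below is purely illustrative), the property in hdfs-site.xml would look like:
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///data/hadoop/datanode</value>
</property>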
Next, make sure the datanode storage on each slave belongs to the same HDFS cluster as the master. If a slave was already formatted as its own single-node cluster, its datanode will refuse to join; either clear the contents of dfs.datanode.data.dir on that slave before starting it, or make the clusterID in dfs.datanode.data.dir/current/VERSION on the slave nodes match the one on the master node.
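A quick way to verify the IDs line up is to compare the VERSION files directly; the directories below stand in for whatever you configured as the name and data directories:
grep clusterID /data/hadoop/namenode/current/VERSION    # on the master (namenode side)
grep clusterID /data/hadoop/datanode/current/VERSION    # on each slave (datanode side)
The two clusterID values must be identical, otherwise the datanode will not register with the namenode.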
The last step is to start the slaves. We just need to start the DataNode and NodeManager daemons on each slave node:
hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
yarn-daemon.sh --config $HADOOP_CONF_DIR start nodemanager
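Once the daemons are up, a short check from both sides confirms the new nodes joined the cluster:
jps                      # on each slave, should now list DataNode and NodeManager
hdfs dfsadmin -report    # on the master, the new datanodes should appear as live nodes
yarn node -list          # on the master, the new NodeManagers should be listed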