Setting up Hadoop made easy
This blog walks through installing Hadoop 1.2.1 on Linux, first in standalone mode and then in pseudo-distributed mode.
It takes at most 2 hours if you are lucky 🙂
Please follow the steps below:
Step-1:
1. Download a stable release ending with tar.gz (e.g. hadoop-1.2.1.tar.gz)
2. In Linux, create a new folder "/home/hadoop"
3. Move the downloaded file to "/home/hadoop" using WinSCP or FileZilla.
4. In PuTTY type: cd /home/hadoop
5. Type: tar xvf hadoop-1.2.1.tar.gz
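The sub-steps above can be sketched as one small helper script. This is only a sketch: the `install_hadoop_tarball` name is made up for illustration, and the archive is assumed to be already downloaded and copied onto the Linux box.

```shell
# Sketch of Step-1 as a reusable helper. Assumes the hadoop-1.2.1.tar.gz
# archive has already been downloaded and copied over (e.g. with WinSCP).
install_hadoop_tarball() {   # usage: install_hadoop_tarball <tarball> <dest-dir>
  tarball="$1"; dest="$2"
  mkdir -p "$dest"                                    # step 2: create the folder
  mv "$tarball" "$dest/"                              # step 3: move the archive
  ( cd "$dest" && tar xzf "$(basename "$tarball")" )  # steps 4-5: cd and extract
}

# install_hadoop_tarball hadoop-1.2.1.tar.gz /home/hadoop
```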
Step-2:
Downloading and setting up java:
1. Check if Java is present
Type: java -version
2. If java is not present, please install it by following the below steps
3. Make a directory where we can install Java (/usr/local/java)
4. Download 64-bit Linux Java JDK and JRE ending with tar.gz from the below link:
http://oracle.com/technetwork/java/javase/downloads/index.html
5. Copy the downloaded files to the created folder
6. Extract Java:
Type: cd /usr/local/java
Type: tar xvzf jdk*.tar.gz
Type: tar xvzf jre*.tar.gz
7. Add the path and home variables at the end of /etc/profile:
JAVA_HOME=/usr/local/java/jdk1.7.0_40
PATH=$PATH:$JAVA_HOME/bin
JRE_HOME=/usr/local/java/jre1.7.0_40
PATH=$PATH:$JRE_HOME/bin
HADOOP_INSTALL=/home/hadoop/hadoop-1.2.1
PATH=$PATH:$HADOOP_INSTALL/bin
export JAVA_HOME
export JRE_HOME
export HADOOP_INSTALL
export PATH
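The profile additions can also be appended in one shot with a heredoc. A sketch only: `append_java_profile` is a made-up helper name, and the version numbers assume the same 1.7.0_40 directories used throughout this post.

```shell
# Append the Java/Hadoop environment variables to a profile file in one go.
# The quoted 'EOF' keeps $PATH etc. unexpanded until the profile is sourced.
append_java_profile() {   # usage: append_java_profile <profile-file>
  cat >> "$1" <<'EOF'
JAVA_HOME=/usr/local/java/jdk1.7.0_40
JRE_HOME=/usr/local/java/jre1.7.0_40
HADOOP_INSTALL=/home/hadoop/hadoop-1.2.1
PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_INSTALL/bin
export JAVA_HOME JRE_HOME HADOOP_INSTALL PATH
EOF
}

# append_java_profile /etc/profile   # then: source /etc/profile
```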
8. Run the below commands so that Linux knows where Java is installed:
sudo update-alternatives --install "/usr/bin/java" "java" "/usr/local/java/jre1.7.0_40/bin/java" 1
sudo update-alternatives --install "/usr/bin/javac" "javac" "/usr/local/java/jdk1.7.0_40/bin/javac" 1
sudo update-alternatives --install "/usr/bin/javaws" "javaws" "/usr/local/java/jre1.7.0_40/bin/javaws" 1
sudo update-alternatives --set java /usr/local/java/jre1.7.0_40/bin/java
sudo update-alternatives --set javac /usr/local/java/jdk1.7.0_40/bin/javac
sudo update-alternatives --set javaws /usr/local/java/jre1.7.0_40/bin/javaws
9. Test Java by typing: java -version
10. Check if JAVA_HOME is set by typing: echo $JAVA_HOME
Now we are done with installing Hadoop in standalone mode. 🙂
Step-3:
We can check whether the installation was successful by running one of the bundled examples.
Go to the Hadoop installation directory.
Type: mkdir input
Type: cp conf/*.xml input
Type: bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'
Type: cat output/*
If the job succeeds, the matches are displayed. (Note: do not create the output directory yourself; Hadoop creates it and fails if it already exists.)
Step-4:
As a next step, change the configuration in the below files:
1. In the Hadoop installation folder, change conf/core-site.xml to:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
2. Change conf/hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
3. Change conf/mapred-site.xml:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
</configuration>
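Since each of the three files holds a single property, they can be generated with one small helper. A sketch: `write_conf` is a hypothetical function, and it assumes you run it from the Hadoop installation directory.

```shell
# Write a one-property Hadoop configuration file.
write_conf() {   # usage: write_conf <file> <property-name> <property-value>
  cat > "$1" <<EOF
<configuration>
  <property>
    <name>$2</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}

# write_conf conf/core-site.xml   fs.default.name    hdfs://localhost:9000
# write_conf conf/hdfs-site.xml   dfs.replication    1
# write_conf conf/mapred-site.xml mapred.job.tracker localhost:9001
```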
4. Edit conf/hadoop-env.sh:
export JAVA_HOME=/usr/local/java/jdk1.7.0_40
Step-5:
1. Set up passwordless ssh by running the below commands:
Type: ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
Type: cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
2. Check that ssh no longer asks for a password
Type: ssh localhost (it should not prompt for a password)
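The two key-setup commands can be wrapped in a small function. A sketch: `setup_passwordless_ssh` is an illustrative name, and the key type is a parameter because newer OpenSSH releases reject DSA keys, so you may need rsa or ed25519 instead of the dsa used above.

```shell
# Generate a passphrase-less key and authorize it for the local machine.
setup_passwordless_ssh() {   # usage: setup_passwordless_ssh <home-dir> [key-type]
  home="$1"; type="${2:-dsa}"
  mkdir -p "$home/.ssh" && chmod 700 "$home/.ssh"
  ssh-keygen -q -t "$type" -P '' -f "$home/.ssh/id_$type"   # no passphrase
  cat "$home/.ssh/id_$type.pub" >> "$home/.ssh/authorized_keys"
  chmod 600 "$home/.ssh/authorized_keys"
}

# setup_passwordless_ssh "$HOME"   # then: ssh localhost
```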
3. Format the name node:
Type: bin/hadoop namenode -format
Step-6:
To start all the Hadoop services, from the Hadoop installation directory:
Type: bin/start-all.sh
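Once the services are up, `jps` (shipped with the JDK) should list the five Hadoop 1.x daemons. A sketch helper (the `missing_daemons` name is made up) that prints whichever expected daemons are absent from `jps` output:

```shell
# The five daemons a Hadoop 1.x pseudo-distributed cluster should run.
expected_daemons="NameNode DataNode SecondaryNameNode JobTracker TaskTracker"

missing_daemons() {   # usage: jps | missing_daemons
  jps_out=$(cat)                        # read the jps listing from stdin
  for d in $expected_daemons; do
    echo "$jps_out" | grep -qw "$d" || echo "$d"   # report absent daemons
  done
}

# jps | missing_daemons   # no output means all five daemons are running
```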
Now try the same example which we tried earlier; in pseudo-distributed mode the input files must first be copied into HDFS:
Type: bin/hadoop fs -put conf input
Type: bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'
Type: bin/hadoop fs -cat output/*
It should print the matches as before.
To stop all the Hadoop services:
Type: bin/stop-all.sh