My Learning Journey for Hadoop

Hadoop has been one of the most interesting technical topics I have learnt in the last few years. So many topics are intertwined with Hadoop, such as Big Data, MapReduce, YARN and Flume, that the most difficult part is deciding on a good learning path.

In this series of blog posts, I would like to explain the learning path that I followed, along with detailed blogs on concepts and hands-on exercises.

Target Readers: If you are a developer or consultant looking to get a first level of expertise in Hadoop and its related technologies, this series of blogs is for you. Please let me know in the comments if I missed any important topic.

 

“A picture is worth a thousand words” – Keeping this in mind, I have tried to explain with fewer words and more images. Let me know in the comments whether this is helpful 🙂

Chapter One – Kick Start with Hadoop

Prerequisite: None

In this section, we will learn the basics of Hadoop along with a Hello World example. We will learn in 3 steps.

  • Step 1: Understand Why Hadoop is needed?
  • Step 2: Learn what is Hadoop?
  • Step 3: Run Hello World program in Hadoop

Step 1: Understand Why Hadoop is needed?

The best and easiest way to start with any new thing is to ask: why?

In brief, we can say Hadoop came about to solve Big Data challenges.

Go through this blog to know more – What is Big Data and Why do we need Hadoop for Big Data?

Step 2: Learn what is Hadoop?

Once we have answers about the origin of Hadoop, let us try to get a high-level idea of the Hadoop architecture and ecosystem.

We first need the high-level general idea without focusing on the details. The following blogs explain the basics.

Step 3: Run Hello World program in Hadoop

So far, we talked only about concepts and theory. Let us now do something interesting in Hadoop.

Let us run our first Hello World program in Hadoop on a sandbox system.

 

No Installation – Use Hadoop Sandbox System

Instead of installing Hadoop, we can download sandbox systems offered by Hadoop distributions such as Cloudera and Hortonworks. These sandbox systems provide a great way to get started with Hadoop.

A sandbox system is like a personal Hadoop environment that comes with many interactive Hadoop tutorials.

We will use this sandbox to run our Hello World example, which uses HDFS, HCatalog and Hive.

Go through this blog to set up the sandbox system and run the Hello World example –

Hello World program in Hadoop using Hortonworks Sandbox
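The linked post runs its Hello World with Hive on the sandbox. As a rough, purely local illustration of the MapReduce idea mentioned at the start of this blog, a word count (commonly used as a first MapReduce exercise) could be sketched in plain Python. This is not the example from the linked post and it does not call any Hadoop API; the function names map_phase, shuffle_phase, reduce_phase and word_count are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word, the way a mapper would."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group all values by key, mimicking the shuffle between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values for each key, the way a reducer would."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines):
    """Run the three phases in order on an in-memory list of lines."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    # Hypothetical sample input; on a cluster this data would live in HDFS.
    sample = ["hello hadoop", "hello world"]
    print(word_count(sample))
```

On a real cluster the three phases run in parallel across many machines; the sketch only shows how the data flows from one phase to the next.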

 

Let us now add one more component to our example – Pig.

Hello World program in Hadoop using HDFS, HCatalog and Pig – Coming soon 😊

Chapter Two – A Hadoop Real World Example

Prerequisite: You should finish chapter one.

The Hello World program mentioned above doesn’t reflect the strength and range of Hadoop; it is just a sample hands-on exercise to make you familiar with the Hadoop world, and it does not show the real power of Hadoop.

I will soon publish a blog post which will show how to solve a real-world problem using Hadoop.

 

Happy Learning!
