My Learning Journey for Hadoop
Hadoop has been one of the most interesting technical topics I have learnt in the last few years. So many topics are intertwined with Hadoop, such as Big Data, MapReduce, YARN and Flume, that the most difficult part is deciding on a good learning path.
In this series of blog posts, I would like to explain the learning path that I have followed, along with detailed posts on concepts and hands-on exercises.
Target Readers: If you are a developer or consultant looking to gain a first level of expertise in Hadoop and its related technologies, this series of blog posts is for you. Please let me know in the comments if I have missed any important topic.
“A picture is worth a thousand words.” Keeping this in mind, I have tried to explain with fewer words and more images. Let me know in the comments whether this is helpful.
Chapter One – Kick Start with Hadoop
In this section, we will learn the basics of Hadoop along with a Hello World example. We will learn in three steps.
- Step 1: Understand why Hadoop is needed
- Step 2: Learn what Hadoop is
- Step 3: Run a Hello World program in Hadoop
Step 1: Understand why Hadoop is needed
The best and easiest way to start with anything new is to find out why it exists.
In brief, we can say that Hadoop came about to solve Big Data challenges.
Go through this blog post to learn more – What is Big Data and Why do we need Hadoop for Big Data?
Step 2: Learn what Hadoop is
Once we have answers about the origin of Hadoop, let us get a high-level idea of the Hadoop architecture and ecosystem.
We first need the general, high-level picture without focusing on the details. The following blog posts explain the basics.
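To make the high-level picture a little more concrete, here is a minimal sketch of the MapReduce idea (map, then group by key, then reduce) written in plain Python. It does not use Hadoop at all and runs on a single machine. The function names `map_phase`, `shuffle_phase`, `reduce_phase` and `word_count` are made up for illustration and are not part of any Hadoop API.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, the step a framework performs between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["Hello Hadoop", "Hello World"]
    print(word_count(sample))
```

In a real cluster the map and reduce steps would run in parallel on many machines over data stored in HDFS. The sketch only shows the shape of the computation.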
Step 3: Run a Hello World program in Hadoop
So far, we have talked only about concepts and theory. Let us now do something interesting in Hadoop.
Let us run our first Hadoop Hello World program in a sandbox system.
No Installation – Use Hadoop Sandbox System
Instead of installing Hadoop, we can download the sandbox systems offered by Hadoop distributions such as Cloudera and Hortonworks. These sandbox systems are a great way to get started with Hadoop.
A sandbox system is like a personal Hadoop environment that comes with many interactive Hadoop tutorials.
We will use this sandbox to run our Hello World example. In the Hello World example, we will use HDFS, HCatalog and Hive.
Go through this blog post to set up the sandbox system and run the Hello World example –
Let us now add one more component to our example – Pig.
Hello World program in Hadoop using HDFS, HCatalog and Pig – Coming soon
Chapter Two – A Hadoop Real World Example
Prerequisite: You should finish Chapter One.
The Hello World program above doesn’t reflect the strength and range of Hadoop; it is only a hands-on sample to make you familiar with the Hadoop world.
I will soon publish a blog post that shows how to solve a real-world problem using Hadoop.