My Learning Journey for Hadoop
Hadoop has been one of the most interesting technical topics I have learned in the last few years. So many topics are intertwined with Hadoop, such as Big Data, MapReduce, YARN and Flume, that the most difficult part is deciding on a good learning path.
In this series of blog posts, I would like to explain the learning path that I followed, along with detailed posts on concepts and hands-on exercises.
Target Readers: If you are a developer or consultant looking to gain a first level of expertise in Hadoop and its related technologies, this series of blog posts is for you. Please let me know in the comments if I have missed any important topic.
“A picture is worth a thousand words.” Keeping this in mind, I have tried to explain with fewer words and more images. Let me know in the comments whether this is helpful.
Chapter One – Kick Start with Hadoop
Prerequisite: None
In this section, we will learn the basics of Hadoop along with a Hello World example. We will learn in 3 steps.
- Step 1: Understand Why Hadoop is needed?
- Step 2: Learn what is Hadoop?
- Step 3: Run Hello World program in Hadoop
Step 1: Understand Why Hadoop is needed?
The best and easiest way to start with any new thing is to ask: why?
In brief, we can say Hadoop came about to solve Big Data challenges.
Go through this blog post to learn more – What is Big Data and Why do we need Hadoop for Big Data?
Step 2: Learn what is Hadoop?
Once we have answers about the origin of Hadoop, let us try to get a high-level idea of the Hadoop architecture and ecosystem.
We first need the high-level general idea without focusing on the details. The following blog posts explain the basics.
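As a small taste of the processing model before reading those posts, here is a minimal sketch of the MapReduce idea (map, shuffle, reduce) applied to a word count. Please note that this is plain Python running on one machine and not Hadoop code. The function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample input are my own assumptions for illustration only. In real Hadoop, the framework distributes these phases across many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key.

    In Hadoop the framework does this between map and reduce;
    here we simulate it with a dictionary of lists.
    """
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in order on a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, just for illustration.
    sample = ["hello hadoop", "hello world"]
    print(word_count(sample))
```

The point to notice is that the map and reduce functions only ever look at one line or one key at a time, which is why this style of work can be split across many machines.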
Step 3: Run Hello World program in Hadoop
So far, we have talked only about concepts and theory. Let us now do something interesting in Hadoop.
Let us run our first Hadoop Hello World program in a sandbox system.
No Installation – Use Hadoop Sandbox System
Instead of installing Hadoop, we can download the sandbox systems offered by Hadoop distributions such as Cloudera and Hortonworks. These sandbox systems provide a great way to get started with Hadoop.
The sandbox systems are like your personal Hadoop environment, and they come with many interactive Hadoop tutorials.
We will use this sandbox to run our Hello World example. In the Hello World example, we will use HDFS, HCatalog and Hive.
Go through this blog post to set up the sandbox system and run the Hello World example –
Hello World program in Hadoop using Hortonworks Sandbox
Let us now add one more component to our example: Pig.
Hello World program in Hadoop using HDFS, HCatalog and Pig – Coming soon
Chapter Two – A Hadoop Real World Example
Prerequisite: You should finish chapter one.
The Hello World program mentioned above does not reflect the strength and range of Hadoop. It is just a sample hands-on exercise to make you familiar with the Hadoop world, and it certainly does not show the real power of Hadoop.
I will soon publish a blog post which will show how to solve a real-world problem using Hadoop.
Happy Learning!
Thanks for the blog Dolly Mishra but none of the links are opening, might be because those blogs are not published.
Hi Nabheet,
Thanks a lot for the feedback. Yes, the other blog posts are pending approval and will be published soon 🙂
Great tutorial for beginners, looking forward to reading your next posts ! Chapter 2 still has not been published. Waiting..