Fitting Hadoop in an SAP Software Landscape – Part 3 ASUG Webcast
This is the third and final part of my notes from an ASUG webcast. Part 1 is Big Data in an SAP Landscape – ASUG Webcast Part 1 and Part 2 is Using HANA and Hadoop, Key Scenarios – Part 2 ASUG Big Data Webcast.
This blog covers reference architecture, some sample use cases and implementation guidance.
Reference Architecture
Figure 1: Source: SAP
Figure 1 covers how to connect the technologies – see the CIO Guide on Big Data “How to Use Hadoop … | SAP HANA” for more details.
Assuming you have HANA, a data warehouse, and Hadoop, you can use SAP Data Services to help you store data in HDFS
You can also use Data Services to enhance the data and move it into and out of Hadoop
With streaming events in real time you can use the complex event processor
You can use the archive solutions in Hadoop
For landscape management you can look to Intel/Hortonworks, an SAP partner solution
Analytics can be used
You can use data governance to manage the data
A Hadoop development environment may be needed
You do not need to implement them all.
Sample Use Cases
Figure 2: Source: SAP
Figure 2 covers a use case for a computer server hardware manufacturer that wants to understand customer problems better.
They capture customer call center data, store it in Hadoop and use it to identify potential problems in servers.
They took those call results, merged them with hardware monitoring logs and tried to correlate the two
They combined this with CRM and BOM (bill of materials) data to get a complete picture of the problems customers were experiencing
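As a rough illustration of the correlation described above, here is a minimal Python sketch. The records, field names and the `correlate` function are all made up for this example (the webcast did not show any code or data), and the CRM join is left out for brevity. It groups reported issues with the monitoring events logged on the affected servers and the components those servers have in common.

```python
from collections import defaultdict

# Hypothetical sample data; field names are illustrative, not from the webcast.
calls = [
    {"customer_id": "C1", "server_id": "S100", "issue": "overheating"},
    {"customer_id": "C2", "server_id": "S200", "issue": "disk failure"},
    {"customer_id": "C3", "server_id": "S101", "issue": "overheating"},
]
monitoring_logs = [
    {"server_id": "S100", "event": "FAN_FAULT"},
    {"server_id": "S101", "event": "FAN_FAULT"},
    {"server_id": "S200", "event": "SMART_WARNING"},
]
# Bill of materials: components built into each server (also hypothetical).
bom = {
    "S100": ["fan-A", "psu-X"],
    "S101": ["fan-A", "psu-Y"],
    "S200": ["disk-Q", "psu-X"],
}


def correlate(calls, logs, bom):
    """Group reported issues with the logged events and shared components."""
    events_by_server = defaultdict(set)
    for entry in logs:
        events_by_server[entry["server_id"]].add(entry["event"])

    summary = defaultdict(
        lambda: {"servers": set(), "events": set(), "components": None}
    )
    for call in calls:
        server = call["server_id"]
        row = summary[call["issue"]]
        row["servers"].add(server)
        row["events"] |= events_by_server[server]
        parts = set(bom.get(server, []))
        # Keep only the components common to every server with this issue.
        if row["components"] is None:
            row["components"] = parts
        else:
            row["components"] &= parts
    return dict(summary)


result = correlate(calls, monitoring_logs, bom)
print(result["overheating"])
```

In this toy data both overheating servers logged a fan fault and share one fan part, which is the kind of lead the manufacturer in Figure 2 was looking for.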
Figure 3: Source: SAP
Figure 3 is an example of a high tech company that wanted to move from its current data warehouse (not SAP) to HANA and Hadoop
It was a four step process.
Step 1 was to take data from Business Suite and replicate it in Hadoop
Step 2 was to aggregate the data in Hadoop and move the aggregates to the data warehouse
Step 3 was to take the aggregated data into HANA
In the last stage (step 4) they can use SAP BI tools on HANA and smart data access on the raw data
Eventually they will do away with their data warehouse
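The flow of the first three steps can be sketched in a few lines of Python. This is only a toy model under my own assumptions: plain lists and dicts stand in for Business Suite, Hadoop, the data warehouse and HANA, and the function names `replicate`, `aggregate` and `load` are hypothetical, not SAP or Hadoop APIs.

```python
from collections import defaultdict


def replicate(source_rows):
    """Step 1: copy raw records unchanged into the raw store (stand-in for Hadoop)."""
    return [dict(row) for row in source_rows]


def aggregate(raw_rows, key, value):
    """Step 2: summarise the raw records by key, next to where they are stored."""
    totals = defaultdict(float)
    for row in raw_rows:
        totals[row[key]] += row[value]
    return dict(totals)


def load(target, aggregates):
    """Steps 2 and 3: write the aggregates into a target store."""
    target.update(aggregates)
    return target


# Hypothetical order records standing in for Business Suite data.
orders = [
    {"region": "EMEA", "amount": 120.0},
    {"region": "APJ", "amount": 80.0},
    {"region": "EMEA", "amount": 30.0},
]

raw_store = replicate(orders)                      # raw data stays available
totals = aggregate(raw_store, "region", "amount")
warehouse = load({}, totals)                       # existing data warehouse
hana = load({}, totals)                            # HANA receives the same aggregates
print(hana)
```

The point of the design is that the small aggregates travel to the warehouse and HANA while the full raw data stays in the cheaper store, where step 4 can still reach it.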
Implementation Guidance for Hadoop
Figure 4: Source: SAP
How do you start implementing? See the CIO guide
Figure 5: Source: SAP
Figure 5 covers “good ideas to follow”
The SAP speaker, David Burdett, said to “consider all your options – don’t assume HANA or Hadoop is the answer”.
Hadoop is another technology for handling data and it should not be separate but part of your IT strategy and plan
Consider persisting everything (not sure I agree with this) – Hadoop clusters are cheap so you can store all the data you have. This may help answer questions you don’t yet know you will ask
Use the “store now and understand later” approach, as you can just load the data instead of having to understand it ahead of time.
“Perfection can be the enemy of the good” – so much data is in Hadoop that it will be hard to get it perfect. The data may not be reliable but that doesn’t mean it is not useful.
Other guidance is to choose the right architecture to save space and improve performance.
The last one is to move processing down to the data
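To make “move processing down to the data” a bit more concrete, here is a small Python sketch of the idea. It is an assumption-laden toy, not Hadoop code: each inner list stands in for the log lines held on one node, and `count_levels_locally` and `combine` are hypothetical names. Each node summarises its own data, and only the small summaries are brought together.

```python
from collections import Counter

# Hypothetical partitions: each list stands in for the log lines held on one node.
partitions = [
    ["ERROR disk", "INFO boot", "ERROR fan"],
    ["INFO boot", "INFO login"],
    ["ERROR fan", "WARN temp"],
]


def count_levels_locally(lines):
    """Runs where the data lives and returns a small summary, not the raw lines."""
    return Counter(line.split(" ", 1)[0] for line in lines)


def combine(partials):
    """Merges the per-node summaries into one overall result."""
    total = Counter()
    for partial in partials:
        total += partial
    return total


partials = [count_levels_locally(lines) for lines in partitions]
totals = combine(partials)
print(totals)
```

Only three small counters cross the network here instead of every log line, which is the saving the guidance is after.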
Figure 6: Source: SAP
The speaker advised not to make Hadoop simply a technology project
Figure 6 shows that you will want to understand how your industry is using Hadoop, but don’t simply copy what others are doing. Consider what insights the data can give you and get cross-functional input.
Figure 7: Source: SAP
Bringing the data together gives you insight into your own business and what is happening outside it
Figure 7 shows how Hadoop is part of the SAP platforms
Figure 8: Source: SAP
Figure 8 shows that HANA and Hadoop complement each other and summarizes the use cases
Related Information:
If you have a BI / big data story to share, we are interested in hearing from you at ASUG Annual Conference. Please see You’re Invited to Submit A Proposal for ASUG Annual Conference 2014 – Share Your Knowledge
You can go to sapbigdata.com