By Kolusu Suresh Kumar

Fusing Hadoop with SAP for Advanced Analytics

Fusing Hadoop with SAP can help customers with small installations use advanced analytics at an initial stage, while they gradually plan for HANA-based advanced analytics.

The general understanding of archiving is that data is moved to a different store to relieve performance pressure on the SAP server. After the move, the data should still be accessible in the same way and should be restorable if needed.

The point of discussing archiving here is that the SAP ADK (Archive Development Kit) provides standard programs, specific to each business process/object, to (re)move the data.

It is therefore a reasonable idea to reuse the archiving programs to move the data to Hadoop; this document covers only the move of data to Hadoop.
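For orientation, a standard ADK write program is essentially built around a small sequence of ADK function modules. The snippet below is only a rough sketch of that sequence for the purchasing archiving object MM_EKKO, not the actual standard write report; the selection logic is omitted and the variables are placeholders.

* Rough sketch of the ADK write sequence used by standard write programs
* (archiving object MM_EKKO; ls_ekko / lt_ekpo are assumed to be filled
* by the program's selection logic)
DATA: lv_handle TYPE sy-tabix,
      ls_ekko   TYPE ekko,
      lt_ekpo   TYPE STANDARD TABLE OF ekpo,
      ls_ekpo   TYPE ekpo.

* Open a new archive file for the archiving object
CALL FUNCTION 'ARCHIVE_OPEN_FOR_WRITE'
  EXPORTING
    object         = 'MM_EKKO'
  IMPORTING
    archive_handle = lv_handle.

* Start one archive object per purchase order
CALL FUNCTION 'ARCHIVE_NEW_OBJECT'
  EXPORTING
    archive_handle = lv_handle.

* Write the header record
CALL FUNCTION 'ARCHIVE_PUT_RECORD'
  EXPORTING
    archive_handle   = lv_handle
    record_structure = 'EKKO'
    record           = ls_ekko.

* Write the item records
LOOP AT lt_ekpo INTO ls_ekpo.
  CALL FUNCTION 'ARCHIVE_PUT_RECORD'
    EXPORTING
      archive_handle   = lv_handle
      record_structure = 'EKPO'
      record           = ls_ekpo.
ENDLOOP.

* Close the archive object and the file
CALL FUNCTION 'ARCHIVE_SAVE_OBJECT'
  EXPORTING
    archive_handle = lv_handle.

CALL FUNCTION 'ARCHIVE_CLOSE_FILE'
  EXPORTING
    archive_handle = lv_handle.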

A few weeks back I attended the openSAP Big Data training and installed the Windows version of Hadoop on the same host as SAP, to test how SAP data can be moved to Hadoop for use in advanced analytics. In parallel I performed archiving as well, so this document shows some archiving screenshots.

The approach was simple: create some batch files of Hadoop commands and run them from ABAP code along with the archiving routines, roughly as sketched below. There are other ways to run the batch files from SAP, but I preferred this one.
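A minimal sketch of such a call, assuming an external command named ZHADOOP_PUT has been defined in transaction SM69 and points to a Windows batch file that wraps the Hadoop shell; the command name, file path and HDFS directory below are made-up placeholders, not the exact code from the demo:

* Hand the exported file over to Hadoop from ABAP.
* ZHADOOP_PUT is a hypothetical external command maintained in SM69;
* the batch file behind it would contain roughly:
*   hadoop fs -mkdir -p %2
*   hadoop fs -put -f %1 %2
DATA: lt_protocol    TYPE STANDARD TABLE OF btcxpm,
      lv_exitcode    TYPE i,
      lv_params(200) TYPE c.

* Local export file and target HDFS directory as command parameters
lv_params = 'C:\sap_export\4500017911.txt /user/sap/po'.

CALL FUNCTION 'SXPG_COMMAND_EXECUTE'
  EXPORTING
    commandname           = 'ZHADOOP_PUT'
    additional_parameters = lv_params
  IMPORTING
    exitcode              = lv_exitcode
  TABLES
    exec_protocol         = lt_protocol
  EXCEPTIONS
    no_permission         = 1
    command_not_found     = 2
    parameters_too_long   = 3
    security_risk         = 4
    OTHERS                = 5.

IF sy-subrc <> 0 OR lv_exitcode <> 0.
  MESSAGE 'Transfer to Hadoop failed' TYPE 'E'.
ENDIF.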

I started the Hadoop server:

/wp-content/uploads/2015/10/1_821921.png

Most of the archiving routines were copied into a custom program, modified, and attached to a custom transaction code that runs the routines and moves the data.
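The screenshot does not show the code itself, so the following is only a minimal sketch of what the export step of such a custom routine could look like. The parameter p_ebeln, the path C:\sap_export\ and the short field list are illustrative assumptions; the real program would carry the full EKKO/EKPO field lists plus the archiving and delete logic.

* Minimal sketch of the export step inside the custom program.
* p_ebeln, the target path and the field list are placeholders.
PARAMETERS p_ebeln TYPE ekko-ebeln DEFAULT '4500017911'.

DATA: ls_ekko TYPE ekko,
      lt_ekpo TYPE STANDARD TABLE OF ekpo,
      ls_ekpo TYPE ekpo,
      lv_file TYPE string,
      lv_line TYPE string.

CONCATENATE 'C:\sap_export\' p_ebeln '.txt' INTO lv_file.

* Read header and items of the selected purchase order
SELECT SINGLE * FROM ekko INTO ls_ekko WHERE ebeln = p_ebeln.
SELECT * FROM ekpo INTO TABLE lt_ekpo WHERE ebeln = p_ebeln.

* Write a plain-text file on the application server
* (in this demo the same host also runs Hadoop)
OPEN DATASET lv_file FOR OUTPUT IN TEXT MODE ENCODING DEFAULT.

CONCATENATE 'EKKO' ls_ekko-ebeln ls_ekko-bukrs ls_ekko-lifnr ls_ekko-aedat
  INTO lv_line SEPARATED BY '|'.
TRANSFER lv_line TO lv_file.

LOOP AT lt_ekpo INTO ls_ekpo.
  CONCATENATE 'EKPO' ls_ekpo-ebeln ls_ekpo-ebelp ls_ekpo-matnr ls_ekpo-werks
    INTO lv_line SEPARATED BY '|'.
  TRANSFER lv_line TO lv_file.
ENDLOOP.

CLOSE DATASET lv_file.

The resulting text file is then handed to the Hadoop batch file shown earlier.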

/wp-content/uploads/2015/10/2_821922.png

Executing the custom transaction code triggers the Hadoop commands that transfer the data:

/wp-content/uploads/2015/10/3_821923.png

After the code and commands run successfully, the messages below confirm the transfer and execution. I selected only one purchase order, 4500017911, to be moved to Hadoop, so all the data of PO 4500017911 should now exist on the Hadoop server, which we verify in the later screens.


/wp-content/uploads/2015/10/4_821924.png

The header data shows that it was moved from table EKKO:

/wp-content/uploads/2015/10/5_821925.png

The item data likewise shows that it was moved from table EKPO:

/wp-content/uploads/2015/10/6_821926.png

Displaying PO 4500017911 now shows that it is archived (moved):

/wp-content/uploads/2015/10/7_821927.png

Logging on to the Hadoop server to browse the files:

/wp-content/uploads/2015/10/8_821929.png

Below are the file system contents:

/wp-content/uploads/2015/10/9_821930.png

PO 4500017911 can be seen in the Hadoop file system:

/wp-content/uploads/2015/10/10_821931.png

Let’s download it

/wp-content/uploads/2015/10/11_821932.png

I was able to open it with Notepad, which shows the purchase order data:

/wp-content/uploads/2015/10/12_821933.png

This data can now be used in advanced analytics. In the same way, Hadoop can be used as an archiving data store.

A demo can be watched at this link: https://youtu.be/VO3EkIDbW64



      5 Comments
      Kolusu Suresh Kumar
      Blog Post Author

      Comments please....

      Nico J.W. Kuijper

      Impressive demo!
      I have some questions. SAP Archiving is done based on ADK files. These files cannot be read outside the context of SAP systems. In the demo, the files generated by the 'archiving process' are text files that can be opened outside the context of SAP.
      My questions would be:
      1) Can these text files be read by the SAP archiving read program?
      2) In the demo the data is only copied to Hadoop (and thus exists redundantly), but it seems to me that the 'archived' data in Hadoop does not represent the full business document.
      With classical archiving much more SAP data is stored in the archive, for example the change documents belonging to the transactional data, etc. That is missing in this demo, so you probably didn't use the archiving object definition of tables/structures related to a business document (as found in transaction AOBJ)?

      3) SAP is working on standard integration with Hadoop using SDA (Smart Data Access) and NLS (near-line storage) in the context of BW, and in the future also Dynamic Tiering, so SAP is working hard on Hadoop integration on many layers. Did you have a look at that?

      Kind regards, Nico Kuijper

      Kolusu Suresh Kumar
      Blog Post Author

      Hello Nico J.W. Kuijper


      1) I have some questions. SAP Archiving is done based on ADK files. These files cannot be read outside the context of SAP systems. In the demo, the files generated by the 'archiving process' are text files that can be opened outside the context of SAP.


      That is true, ADK files cannot be read outside the SAP context. There were two things I wanted to see when I planned to play with Hadoop archiving: 1) that both structured and unstructured (storage) archiving can be done for SAP, and 2) format independence, so that Hadoop services/components can consume these files in their respective required formats, maybe for analysis, searching something, etc.

        

      2) Can these text files be read by the SAP archiving read program?

               

      Initially, when I started to play with Hadoop, I planned to cover write, read, retrieve and destroy. I was able to read them but found little time to document it (actually, lazy 🙂). Yes! It was possible for me to read it back.
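      For completeness, a read-back can be sketched along the same lines: fetch the file from HDFS (for example via a batch file around "hadoop fs -get") and then read it in ABAP. The path below is a made-up placeholder, not the exact code from the demo.

      * Sketch only: read the exported text file back into an internal table
      DATA: lv_file TYPE string,
            lv_line TYPE string,
            lt_data TYPE STANDARD TABLE OF string.

      lv_file = 'C:\sap_export\4500017911.txt'.

      OPEN DATASET lv_file FOR INPUT IN TEXT MODE ENCODING DEFAULT.
      DO.
        READ DATASET lv_file INTO lv_line.
        IF sy-subrc <> 0.
          EXIT.
        ENDIF.
        APPEND lv_line TO lt_data.  " split each line back into EKKO/EKPO fields as needed
      ENDDO.
      CLOSE DATASET lv_file.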


      3) In the demo the data is only copied to Hadoop (and thus exists redundantly), but it seems to me that the 'archived' data in Hadoop does not represent the full business document. With classical archiving much more SAP data is stored in the archive, for example the change documents belonging to the transactional data, etc. That is missing in this demo, so you probably didn't use the archiving object definition of tables/structures related to a business document (as found in transaction AOBJ)?


      That’s a great observation! I had to use the custom read program, as I found the standard program doesn’t update the indicators in my case, so I had to cut it short. But if the standard program is handled properly, it would move the other data too.


      4) SAP is working on standard integration with Hadoop using SDA (Smart Data Access) and NLS (near-line storage) in the context of BW, and in the future also Dynamic Tiering, so SAP is working hard on Hadoop integration on many layers. Did you have a look at that?


      I am more into OpenText & ABAP, but I am very much interested in Hadoop and big data. I would like to explore them; please share any links you feel are good to explore.


      But this idea of moving or storing data, when implemented well, can serve as a first step towards big data analytics and can help small and medium-sized customers who can't implement HANA due to the comparative cost.

       

      Thanks for the comments. Please share your ideas or ask any questions.

      Thanks,

      Kolusu



      Kolusu Suresh Kumar
      Blog Post Author

      Hello Nico J.W. Kuijper,

         

      Thanks for pointing me to SDA. I went through the documentation, and it gives the impression that it is more of a pull than a push of the data, to analyze the data on HANA. My intention is somewhat different, I guess, unless I expressed it wrongly in the blog.

      But thanks for those queries; based on them I have decided to write about the following stages of the life cycle soon.


      Enjoy! 🙂

      Thanks,

      Kolusu

      VinnaKota Pavitra

       

      Good idea