SAP Data Management explained
In this blog, I provide an overview of the various SAP Data Management solutions around SAP Data Hub, with their respective main capabilities and interdependencies:
- SAP Data Hub
- Data in Hub (DiH)
- Agile Data Preparation (ADP)
- Smart Data Integration (SDI)
- Smart Data Quality (SDQ)
- Data Services (DS)
- Landscape Transformation (LT)
- Advanced Data Migration (ADM)
- Information Steward (IS)
Since I focus on SAP Data Hub, this list is not complete, and the way I explain the different solutions is of course biased by my own point of view, so please follow the links I provide for further information. Also, as I see it, the main capabilities of some solutions overlap considerably, so I would always recommend evaluating which alternative best meets your requirements. This matrix might be a good starting point:
Also, if you are interested in understanding SAP’s Data Management strategy first hand, you can watch my interview from TechEd Las Vegas 2017 with Ken Tsai, VP SAP Data Management, about Managing Your Data Across Your Enterprise.
SAP Data Hub
When trying to understand Data Hub, one must work through the marketing message:
The architecture view:
As well as the product road map & vision:
And of course, the outlook of what it might become:
To summarize, in my humble opinion, SAP Data Hub is a data catalogue with lineage (pipeline and workflow) modelling, execution and reporting augmented with self-service (agile) data preparation.
It leverages SAP Vora for big data access and, through Agile Data Preparation, Smart Data Integration and Smart Data Quality for ETL capabilities and data quality.
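To make the "catalogue with lineage" idea a bit more tangible, here is a minimal sketch in plain Python. It is only an illustration of the concept and does not use or mirror any SAP Data Hub API; the `Catalogue` class and the dataset names are hypothetical and made up for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Catalogue:
    """Toy data catalogue (hypothetical): records which datasets a dataset was derived from."""

    parents: Dict[str, List[str]] = field(default_factory=dict)

    def register(self, name, derived_from=()):
        # Store the direct upstream datasets of `name`.
        self.parents[name] = list(derived_from)

    def lineage(self, name):
        """Return all upstream datasets of `name`, nearest first (breadth-first)."""
        seen = []
        queue = list(self.parents.get(name, []))
        while queue:
            current = queue.pop(0)
            if current not in seen:
                seen.append(current)
                queue.extend(self.parents.get(current, []))
        return seen


# Made-up pipeline: two raw sources, one cleansing step, one report.
catalogue = Catalogue()
catalogue.register("raw_orders")
catalogue.register("raw_customers")
catalogue.register("clean_orders", derived_from=["raw_orders"])
catalogue.register("sales_report", derived_from=["clean_orders", "raw_customers"])

upstream = catalogue.lineage("sales_report")
print(upstream)
```

Asking for the lineage of `sales_report` walks back through every registered step, which is roughly the question a lineage report answers: where did this data come from?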
Data in Hub
Data in Hub seems to be an SAP Digital Business Services offering that leverages Data Management & Landscape Transformation Services, according to the 2017 SAPPHIRE NOW + ASUG Annual Conference session Harmonize Data and Prepare for the Digital Economy.
If I understand it correctly, Data in Hub is a modern, HANA-based data warehousing approach built on extract, transform and load (ETL). It seems to leverage Landscape Transformation tools for data access and Data Services for ETL capabilities and data quality, whereas HANA seems to provide the storage and transformation capabilities.
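For readers less familiar with the ETL pattern itself, the following is a minimal, generic sketch in plain Python. It is not how Data in Hub, Data Services or HANA work internally; SQLite merely stands in for a target database, and the sample data, the `customers` table and the cleansing rule are assumptions made up for illustration.

```python
import csv
import io
import sqlite3

# Made-up source data with typical quality issues (stray spaces, missing name).
SOURCE_CSV = """id,name,country
1, Alice ,de
2,Bob,US
3,,fr
"""


def extract(text):
    # Extract: read the raw records from the source.
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    # Transform: trim names, normalise country codes, drop records without a name.
    cleaned = []
    for row in rows:
        name = row["name"].strip()
        if not name:
            continue
        cleaned.append((int(row["id"]), name, row["country"].strip().upper()))
    return cleaned


def load(rows, connection):
    # Load: write the cleansed records into the target table.
    connection.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
    )
    connection.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    connection.commit()


connection = sqlite3.connect(":memory:")  # in-memory stand-in for the target system
load(transform(extract(SOURCE_CSV)), connection)

loaded = connection.execute(
    "SELECT id, name, country FROM customers ORDER BY id"
).fetchall()
print(loaded)
```

The record without a name never reaches the target table, which is the kind of simple data quality rule that the ETL and data quality tools discussed here apply at a much larger scale.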
Agile Data Preparation
As I see it, Agile Data Preparation is an end-user-oriented self-service solution. It seems to mostly leverage existing capabilities from the HANA Rules Framework, Smart Data Integration and Smart Data Quality. If you are interested, I give a practical example of its workings in this blog: SAP Agile Data Preparation Tutorial.
Smart Data Integration and Smart Data Quality
If I am not mistaken, Smart Data Integration and Smart Data Quality are the beginnings of a HANA reimplementation of Data Services. While I do not believe that they are feature-complete yet, they work well if their features and functionality are sufficient for your requirements.
Data Services
Data Services is a classical ETL tool and exists in two incarnations: on-premise Data Services and SAP Cloud Platform Integration for data services.
Landscape Transformation
Landscape Transformation is a set of tools and technical procedures to support business transactions in the areas of:
- Leverage Sell, Buy, and Restructure
- Consolidate and Reduce IT Cost
- Unify and Transform Data
If you want to try it yourself, there is my blog on the topic: How to make SAP Landscape Transformation 2.0 (SAP LT) work for you 101.
Advanced Data Migration
Advanced Data Migration is a process-oriented data migration tool, especially in the realm of SAP S/4HANA. Its aim is to involve the data owners in the migration process by providing them with both business-oriented insights into the migration process and tools to improve the data quality.
If you want to get into some more detail, there is this blog series of mine: Load your source data into your target system with Advanced Data Migration.
Information Steward
Information Steward is a passive data governance tool to monitor, analyse, and improve data integrity.
Good blog. However, in your table comparing the capabilities of the different solutions, you listed SAP Data Hub as having "state of the art" data lineage, in the same league as SAP Information Steward's data lineage.
Q. I have not seen SAP Data Hub's data lineage capabilities at all, so how did you conclude that they are as capable as SAP Information Steward's?
Good approach. I would like to add my 5 cents:
We need to differentiate marketing from technical aspects.
Take SDI and SDQ: they used to be based on XSC, and to this day they are not 100% migrated to XSA.
Data Hub seems to be based on XSA and Kubernetes, which seems to be SAP's new development framework for data applications. XSA is a local Cloud Foundry solution. So in fact you have another two levels of separation, or containerization/microservices.
Same architecture as HANA Vora 2.0.
It is completely unclear to me whether SAP will migrate SDI, SDQ, or the Data Warehousing Foundation, to that architecture as well and how they will manage that with licences.
As of now, Data Hub seems to be mainly based on marketing. It took years to get SDI and SDQ to an acceptable quality level. And still, their functionality seems to be below that of Data Services. To what extent Data Hub will supersede SDI, SDQ and many other solutions remains to be seen. The roadmap for SDI/SDQ, or Data Warehousing Foundation, does not seem that ambitious. So it might be that developer focus has shifted.
By the way, I am surprised that Data Warehousing Foundation (DWF) is not mentioned. It contains quite a bit of scheduling functionality as well.
This piece was really helpful.