SAP Datasphere – Q&A and Partnerships
On March 8, 2023, SAP launched SAP Datasphere and, on the same day, renamed the existing SAP Data Warehouse Cloud tenants to the new name.
SAP Datasphere is meant to deliver a Business Data Fabric that offers seamless access to your data, regardless of where it is stored.
In the session, a lot of questions were asked and answered by different SAP employees (thank you for this dialog!) about what this means and what the impact on other SAP solutions is. I compiled the most important questions I saw in the session in part 1 below. After that, I want to write down my first thoughts and considerations about the new strategic partnerships.
Part 1 – Q&A:
The external SAP Datasphere FAQ has also just been updated.
—– General Questions —–
Q: Will the Datasphere eventually replace DWC?
A: SAP Datasphere is the evolution of SAP Data Warehouse Cloud.
Q: What does “evolution” mean when you say that SAP Datasphere is the evolution of SAP DWC? Will DWC be renamed/rebranded or will they co-exist?
A: It is a rebranding. SAP DWC will become SAP Datasphere.
Q: What is the difference/advantage of SAP Datasphere over SAP Data Warehouse Cloud?
A: SAP Datasphere is the evolution of SAP Data Warehouse Cloud. It includes many new features, including a global data catalog, deep partnerships, and an advanced analytic model.
—– SAP Data Intelligence —–
Q: Will you provide a clear roadmap for SAP Data Intelligence today?
A: SAP Data Intelligence Cloud will continue as a supported product with ongoing innovation and investment.
Q: Is SAP Data Intelligence embedded within Datasphere?
A: SAP Data Intelligence Cloud will continue as its own product, but many of its capabilities will be included in SAP Datasphere as well. The underlying engines that handle data movement are the same, and in that sense, yes, it is embedded. However, there are still differences between the two solutions. For example, SAP Data Intelligence Cloud runs on fully dedicated infrastructure per customer, whereas SAP Datasphere is a true multi-tenant solution on shared infrastructure.
Q: Will all features of SAP Data Intelligence be included in SAP Datasphere? Or what would be a reason to have both products?
A: SAP Data Intelligence Cloud continues as its own solution. We intend to have SAP Datasphere and SAP Data Intelligence Cloud co-exist until SAP Datasphere supports all SAP Data Intelligence Cloud customer use cases. The plan is that SAP Datasphere will eventually be able to cover all the major capabilities, target systems, and use cases that SAP Data Intelligence Cloud provides. We plan to also provide tools to facilitate the technical transition.
—– Integration Aspects —–
Q: Is SAP Datasphere purely federated, or does it still require persisted data in the cloud data model when combining different data sources for analytics?
A: SAP Datasphere follows a federation-first approach, meaning you leave data where it resides, build your models, and decide later whether you want to replicate full tables or persist views in order to address source system workload, data egress, and performance for the end user.
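To make the difference between federated and persisted access more tangible, here is a minimal sketch of my own. It is only an analogy and uses Python's built-in sqlite3 module, not any SAP Datasphere API. The table and view names (source_orders, v_orders_total, p_orders_total) are hypothetical, and a local table stands in for a remote source system.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with a 'source' table, a view and a snapshot."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical stand-in for a table that lives in a remote source system.
    conn.execute("CREATE TABLE source_orders (id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany(
        "INSERT INTO source_orders VALUES (?, ?)", [(1, 10.0), (2, 20.0)]
    )
    # Federated-style access: the view stores no data and reads the source on every query.
    conn.execute(
        "CREATE VIEW v_orders_total AS SELECT SUM(amount) AS total FROM source_orders"
    )
    # Persisted-style access: a snapshot of the view result is stored as a table.
    conn.execute("CREATE TABLE p_orders_total AS SELECT total FROM v_orders_total")
    return conn


def totals(conn):
    """Return (federated_total, persisted_total)."""
    federated = conn.execute("SELECT total FROM v_orders_total").fetchone()[0]
    persisted = conn.execute("SELECT total FROM p_orders_total").fetchone()[0]
    return federated, persisted


def refresh_persisted(conn):
    """Reload the snapshot from the view, similar in spirit to refreshing a persisted view."""
    conn.execute("DELETE FROM p_orders_total")
    conn.execute("INSERT INTO p_orders_total SELECT total FROM v_orders_total")
```

After a new row is inserted into source_orders, totals() returns the new sum for the view but the old sum for the snapshot until refresh_persisted() is called. That is the trade-off described in the answer: the federated path is always current but puts load on the source with every query, while the persisted path is fast and cheap for the source but needs a refresh.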
Q: What type of mechanism does Datasphere use for real-time replication?
A: SAP Datasphere will support a cloud-native, trigger-based real-time replication mechanism, which allows large data sets to be replicated efficiently. You will find this functionality as “Replication Flow” in SAP Datasphere.
Q: How are you planning to enable seamless integration with on-premise solutions? Will DP Agent be there for the long run?
A: For SAP S/4 and ERP systems we will make use of DMIS (part of SAP Landscape Transformation) and CDS views as a way to integrate data in near real time with initial and delta replication. For other on-premise solutions we will initially use DP Agent, but over time our intent is to enable seamless integration without requiring an on-premise agent.
Q: Is Datasphere replication able to move on prem data to on prem targets without a roundtrip to the cloud?
A: Our main focus with SAP Datasphere is to replicate data into the cloud and distribute it further. We are planning for hybrid scenarios where the workload could be executed, for example, on premise while being orchestrated in the cloud, to avoid the roundtrip.
—– BW/4HANA & SAC —–
Q: Where does SAP BW/4HANA fit into SAP Datasphere?
A: SAP BW/4HANA objects and other objects can be imported via SAP Datasphere, BW bridge, which lets users access their data and models via a workspace within SAP Datasphere.
Q: How will these products interact with SAP Analytics Cloud?
A: SAP Datasphere is tightly integrated with SAP Analytics Cloud to support analytics and planning use cases. We intend to further strengthen the integration with the release of the Analytic Model in SAP Datasphere. The Analytic Model offers a multi-dimensional modeling experience and comes with powerful new features, such as calculated and restricted measures, exception aggregations and the pruning of attributes and measures.
Part 2 – Strategic Partnerships:
SAP announced four new strategic partnerships to support the idea of a Business Data Fabric and to better support the integration of non-SAP data in a unified usage context.
Databricks was founded by the creators of Apache Spark and has built a complete ecosystem around it, delivering the Data Lakehouse based on a multi-cloud strategy similar to SAP's. In simple terms, a Data Lakehouse can be understood as bringing together modern file formats (Apache Parquet (used by Databricks), Apache ORC, Apache Avro) and open table formats (Delta Lake (Databricks), Apache Iceberg, Apache Hudi) with a powerful, distributed query engine to process the data (Photon for Databricks).
The advantage of a Data Lakehouse is that you have all kinds of data (structured, semi-structured, unstructured) in one tier and can serve all your data roles, such as data engineers, data analysts, data scientists, BI modelers and so on, from this tier.
Databricks is widely credited with popularizing the term “Data Lakehouse” and is the top partner in this area, even if others provide Data Lakehouse technologies, too.
If you look at market research, for example from Forrester or BARC, on data catalogs, data intelligence or metadata management, Collibra is typically one of the top three solutions in this area (usually together with Alation and Informatica or IBM).
SAP has some history with metadata management and has already delivered it, e.g. with SAP Information Steward, SAP PowerDesigner and other data and analytics solutions. SAP offers data catalog functionality within SAP Data Intelligence and has been building up more and more capabilities within SAP Data Warehouse Cloud.
So in a world where data assets are distributed all over the company and an overview and understanding of our data is becoming more and more important, a data catalog is clearly recommended and will become a cornerstone of the data culture within data-driven companies.
Collibra has recently expanded its platform with data quality management and data observability capabilities, together with a partner ecosystem.
See also: SAP Datasphere & Partnerships – Collibra
Confluent, similar to Databricks, is a company built on another important piece of open source software for data management: Apache Kafka. If you have streaming data in your company, you can hardly avoid taking a look at Kafka. Confluent delivers Kafka from the cloud as a service with an optimized ecosystem.
For data-driven companies, the speed of collecting and processing data in near real time is becoming more and more important. If you search the SAP Community, you will find that Kafka is a regular topic here, too.
DataRobot is a pioneer of Automated Machine Learning and delivers a broad AI platform today. In the Forrester Wave Q3/2022, DataRobot is seen as a Leader for AI/ML Platforms, while Databricks holds the position of a Strong Performer.
This was maybe the most surprising partnership, as I have seen SAP making good progress in expanding its AI capabilities based on HANA, Augmented Analytics (e.g. via APL or SAP Analytics Cloud predictive features), or the relatively new offering SAP AI Core. But here we also see that these tools and services are mostly used together with SAP solutions. So for expanding to non-SAP data and use cases, this could be the right way.
More about the current state of this partnership can be read in this statement.
If I look into my SAP Datasphere tenant today (formerly SAP Data Warehouse Cloud), I see just the announced new features (Analytic Model, Catalog, Replication Flow). More will come, on the partner side as well as on the SAP side. Even if integration is already possible today, more is expected, as the comments on the blog “Unified Analytics with SAP Datasphere & Databricks Lakehouse Platform- Data Federation Scenarios” show.
I remember the announcement of SAP Data Hub (now SAP Data Intelligence), where SAP also announced openness to other vendors and partnerships and a very similar vision (just from my memory). In a world where end-user companies are less and less bound to vendors because of the need to make the best of their data, openness and partnerships are essential. Transferring capabilities from SAP Data Intelligence into SAP Datasphere will be the right decision, as SAP DI will bring in further capabilities that are essential for a real data fabric.
SAP chose to start with top vendors in the market, which seems to be the right approach. I would be happy if this works out well and also opens the minds of internal SAP-only advocates in the area of data and analytics. SAP is still at the top in creating and handling business data, but the process side is different from the data side. SAP has a big footprint in many companies, but it is typically not the only player. Data warehouses have been there to solve this in the past. In today's hyper-fast hybrid world, approaches have to evolve, and a Business Data Fabric, done right, shows the right way.
Happy to hear what you think about this and whether you already have some experience with how the solutions from these partnerships play together with SAP.