SAP Datasphere and Partnerships – DataRobot
On March 8, 2023, SAP launched SAP Datasphere at its Data Unleashed event as the evolution of SAP Data Warehouse Cloud and the flagship next-generation product of its data warehousing portfolio. Datasphere is a comprehensive data service built on SAP Business Technology Platform (BTP). It is the foundation for a business data fabric, designed to help businesses deliver business data to data consumers within an organization with all its context and logic intact.
During the same event, SAP also announced four new strategic partnerships with industry-leading data and AI vendors, including DataRobot Inc. The conceptual framework for SAP Datasphere and these strategic partnerships is shown in Figure 1.
DataRobot provides enterprises with a data science platform that helps data scientists experiment with ML models and deploy them in production to drive better business outcomes. DataRobot is also one of the pioneers of Automated Machine Learning (AutoML) at scale. In the Forrester Wave Q3/2022, DataRobot is ranked as a Leader for AI/ML Platforms, while Databricks holds the position of a Strong Performer. DataRobot's user-friendly interface makes it accessible to a wide range of analytical users, including business analysts, citizen data scientists and expert data scientists. Its efficient ML / AI backbone engine helps significantly reduce the time to value derived from business process data.
This partnership enables customers to use the multimodal AutoML capabilities of DataRobot in conjunction with SAP Datasphere, with Datasphere acting as a logical source of federated ML task data. The task-specific data may physically reside in disparate cloud and / or on-prem data warehousing products. To summarize, the SAP – DataRobot partnership brings together best-of-breed players in the ML and business process spaces: state-of-the-art multimodal machine learning capabilities complement extensive business process knowledge to deliver effective, business-centric solutions.
Additional details, latest status, and a relevant vision video can be found at the following link.
A few of the key strengths of the DataRobot platform that make it an industry leader in the AutoML space include:
- Integration and exploration of multiple data sets from disparate data sources.
- Rapid automated feature engineering from the provided business process data for machine learning specific tasks.
- Automated training and evaluation of multiple ML models in parallel during model development, to recommend the best possible model for the task at hand.
- Root cause analysis capabilities to better support business decision-making.
- A comprehensive set of tools to validate and govern ML models deployed in production.
- An open and interoperable ecosystem for data warehouses, ML APIs, workflow tooling, BI tools, and business apps.
- A highly scalable, secure and robust deployment infrastructure with data protection measures in place.
Further details about the above-mentioned technical capabilities of DataRobot can be found at the following link.
SAP – DataRobot Architectural Pattern
The envisioned interoperable architecture for the DataRobot and SAP Datasphere ecosystem is shown in Figure 2. As the figure shows, DataRobot AutoML and the ML / AI technologies in the SAP portfolio can operate in a complementary fashion to deliver the best possible business-centric solutions for our customers.
The architecture diagram above outlines two paths for developing ML / AI based artifacts. The key difference between the two approaches is how the developed ML model is consumed for inference. When the embedded SAP Datasphere ML / AI engines are used to develop the model, Intelligent Scenario Lifecycle Management (ISLM) is leveraged to embed it into the respective LoB and / or S/4 applications. When the DataRobot platform is used instead, the developed model is first imported into the SAP landscape using AI Core and AI Launchpad. The imported model is then embedded into the respective LoB and / or S/4 applications via ISLM.
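The two paths described above can be summarized in a few lines of Python. This is only an illustrative sketch: the names `DevelopmentPath` and `integration_steps` are hypothetical and do not correspond to any SAP or DataRobot API. The sketch simply encodes which steps each path goes through before a model is available for inference.

```python
from enum import Enum
from typing import List


class DevelopmentPath(Enum):
    """Where the ML model is developed (hypothetical names, for illustration only)."""
    EMBEDDED_DATASPHERE = "embedded SAP Datasphere ML / AI engines"
    DATAROBOT = "DataRobot platform"


def integration_steps(path: DevelopmentPath) -> List[str]:
    """Return the steps that bring a developed model into LoB / S/4 applications."""
    steps = []
    if path is DevelopmentPath.DATAROBOT:
        # A model built in DataRobot first has to be imported into the SAP landscape.
        steps.append("Import the model into the SAP landscape via AI Core and AI Launchpad")
    # Both paths end with ISLM embedding the model into the business applications.
    steps.append("Embed the model into LoB and / or S/4 applications via ISLM")
    return steps


if __name__ == "__main__":
    for path in DevelopmentPath:
        print(path.value)
        for step in integration_steps(path):
            print("  -", step)
```

Either way the final step is the same; the DataRobot path only adds the import step in front of it.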
In the architecture depicted above, both citizen and expert data scientists can access the same business process data to experiment and develop ML based solutions with the tool of their choice. For instance, a citizen data scientist can access business process data and use the AutoML capabilities of DataRobot to implement an ML task, which may include ad-hoc data analyses, ML model development, determining the most influential features in the provided data set, and root cause analysis for anomalies in the business process data. An expert data scientist, on the other hand, can use the same architecture for a quick exploratory data analysis, to assess the predictive power of the provided data set, and to shorten the turnaround time for a baseline ML model by starting with the DataRobot platform. The knowledge gained from experimenting with DataRobot can subsequently be expanded and improved upon with the embedded ML / advanced analytics engines in SAP Datasphere. These embedded engines include the Predictive Analysis Library (PAL), the Automated Predictive Library (APL), Text Analysis and Geo-Spatial. Moreover, an expert data scientist can use the Jupyter Notebook feature of the DataRobot platform to access the embedded advanced analytics engines in SAP Datasphere directly.
Which of the two options stated above should be chosen for an ML experimentation and / or deployment use case when discussing an opportunity with a customer? This is a logical question after reading the information above. The correct answer is: it depends! To clarify, SAP Datasphere will be the data warehousing choice irrespective of which ML / AI implementation path is chosen.
As with any ML / AI centric customer engagement, the envisioned engagement model comprises the following steps, leading up to an optimal architectural decision for the customer's ML / AI journey.
- The first step is to identify and qualify advanced analytics centric ML / AI business use cases after a deep-dive discussion / interview with the customer.
- Once potential ML / AI use cases have been identified, recognize how many of them are candidates for the citizen data scientist skill level, i.e., use cases that can be implemented with wizard-driven AutoML.
- If the customer's implementation priority and need is for citizen data scientist oriented use cases, then an approach leading with DataRobot is a good starting choice.
- Customers for whom DataRobot and SAP Datasphere, with its embedded advanced analytics engines, can be jointly positioned may fall into the following categories:
  - Customers with a good mix of identified potential use cases that can be delivered by citizen and expert data scientists.
  - Customers with business analysts and / or citizen data scientists who are willing partners in the intelligent enterprise vision and journey.
  - Customers where expert data scientists are looking for an automated ML / AI tool to quickly validate their model-related hypotheses, and then use advanced algorithms to build on their initial understanding of the business process data.
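The engagement steps above can be read as a small decision procedure. The following Python sketch is one possible reading, not an official SAP or DataRobot method: `UseCase`, `recommend_approach` and the returned labels are hypothetical names introduced for illustration, and the fallback for cases the steps do not cover is an assumption in the spirit of "it depends".

```python
from dataclasses import dataclass
from typing import List

LEAD_WITH_DATAROBOT = "lead with DataRobot"
JOINT_POSITIONING = "jointly position DataRobot and SAP Datasphere"
UNDECIDED = "it depends: continue the discussion with the customer"  # assumed fallback


@dataclass
class UseCase:
    """A qualified ML / AI use case (hypothetical structure)."""
    name: str
    automl_implementable: bool  # True if a citizen data scientist could deliver it via AutoML


def recommend_approach(
    use_cases: List[UseCase],
    citizen_use_cases_are_priority: bool,
    experts_want_quick_validation: bool = False,
) -> str:
    """Map the qualified use cases to a starting approach."""
    citizen = [u for u in use_cases if u.automl_implementable]
    expert = [u for u in use_cases if not u.automl_implementable]

    # Priority on citizen data scientist use cases: lead with DataRobot.
    if citizen and citizen_use_cases_are_priority:
        return LEAD_WITH_DATAROBOT
    # A good mix of use cases, or experts looking for quick hypothesis validation.
    if (citizen and expert) or experts_want_quick_validation:
        return JOINT_POSITIONING
    return UNDECIDED


if __name__ == "__main__":
    portfolio = [
        UseCase("Ad-hoc sales analysis", automl_implementable=True),
        UseCase("Custom forecasting model", automl_implementable=False),
    ]
    print(recommend_approach(portfolio, citizen_use_cases_are_priority=False))
```

In practice the qualification step is a conversation with the customer rather than a boolean flag, so the sketch is best seen as a checklist in code form.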
By leveraging the DataRobot ML / AI platform, available through the SAP – DataRobot partnership, in a complementary fashion with the embedded SAP Datasphere advanced analytics engines, customers can create a very powerful set of ML / AI based solutions in a relatively short time.
This blog post reflects my opinion and current perspective.