Extend the Power of Data for SAP RISE Customers: data federation with SAP in multi-cloud GCP
updated date: 30.Aug.2023
More and more customers use a multi-cloud strategy to harvest the wealth of services and data sources available from each hyperscaler (Microsoft Azure, Amazon Web Services, Google Cloud). Complementing your SAP information with relevant data from the right cloud provider helps generate data insights and foresight. To do so, it is essential to understand the strengths of each cloud provider and to know how to use their tools and processes. Hence, we are writing a series of blog posts for RISE with SAP Private Cloud Edition customers who combine RISE with their own hyperscaler subscriptions.
This blog post focuses on GCP (Google Cloud Platform).
Link for AWS: here.
Link for Azure: here.
For RISE with SAP customers, in multi-cloud environments, having their own Google Cloud organization enables customers to access a variety of data sources via Google Analytics Hub, like Google Trends, weather or sustainability data consolidated on the planet-scale data warehouse BigQuery. The same is true for applications like Google Ads, Google Marketing Platform, and services around Data Analytics, Machine Learning and Artificial Intelligence including Generative AI. The recent announcement about Datasphere (SAP side) / BigQuery (Google side) interoperability opens up new avenues for accelerating business outcomes based on connected insights across SAP’s, Google’s and other datasets.
1. Value Proposition
1.1. Data Interoperability
Google Cloud offers strong data interoperability. There are two approaches: either you make a copy of your data (also called “export” or “replication”), or you use the data where it resides. The latter is what data federation is about. With the recent announcement about Datasphere/BigQuery interoperability, federation becomes available for accessing Google data directly from SAP Datasphere. A reference architecture can be found in the SAP Discovery Center: “Businesses can enable richer insight by integrating data from SAP Datasphere with data in Google BigQuery services. With this, the data stays in source systems and queries are federated via virtual tables in SAP Datasphere. Data is never cached or replicated from its source.”
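To make the difference between replication and federation concrete, here is a minimal sketch. It uses two in-memory SQLite databases as stand-ins for the SAP side and the Google side. That is purely an assumption for illustration, since real federation runs through virtual tables in SAP Datasphere. The table names, products, and numbers are made up.

```python
import sqlite3


def build_landscape() -> sqlite3.Connection:
    """Create a local store plus an attached 'remote' store (both hypothetical)."""
    conn = sqlite3.connect(":memory:")                      # main db: stand-in for the SAP side
    conn.execute("ATTACH DATABASE ':memory:' AS remote")    # attached db: stand-in for the Google side
    conn.execute("CREATE TABLE sales (product TEXT, revenue REAL)")
    conn.execute("CREATE TABLE remote.trends (product TEXT, search_score INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)",
                     [("lager", 1200.0), ("stout", 300.0)])
    conn.executemany("INSERT INTO remote.trends VALUES (?, ?)",
                     [("lager", 80), ("stout", 35)])
    return conn


def replicate(conn: sqlite3.Connection) -> None:
    """Replication: copy the remote rows into a local table (a snapshot)."""
    conn.execute("CREATE TABLE trends_copy AS "
                 "SELECT product, search_score FROM remote.trends")


# Federation: the join reads the remote table in place at query time.
FEDERATED_SQL = """
    SELECT s.product, s.revenue, t.search_score
    FROM sales AS s JOIN remote.trends AS t ON t.product = s.product
    ORDER BY s.product
"""

# Replication: the join reads the local snapshot instead.
REPLICATED_SQL = """
    SELECT s.product, s.revenue, c.search_score
    FROM sales AS s JOIN trends_copy AS c ON c.product = s.product
    ORDER BY s.product
"""

conn = build_landscape()
replicate(conn)
# The source data changes after the snapshot was taken.
conn.execute("UPDATE remote.trends SET search_score = 95 WHERE product = 'lager'")

federated = conn.execute(FEDERATED_SQL).fetchall()
replicated = conn.execute(REPLICATED_SQL).fetchall()
```

After the remote score changes, the federated query sees the new value, while the replicated copy still holds the old one until it is refreshed.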
As stated in the previous blog post, ‘Unlock the Power of Business Data for SAP RISE Customers: Mastering Data Management and Cultivating Insights’, using SAP data management solutions on top of business data generated in the SAP landscape has unique value, especially for currency conversion, hierarchies, derivations, time-dependent master data, and so on. Typical examples are corporate-finance-related planning, analytics, visualization, and machine learning.
That said, data generated in SAP ERP and CRM loses its context outside of the SAP landscape. When data is exported out of the SAP landscape, you have to rebuild the context in the target data lake using operations like joins, aggregations, and calculations to mimic SAP’s application logic. Given the complexity, this proves to be extremely difficult and expensive to manage.
To give a specific example among data warehousing approaches: as shown in Fig. 1, flow 1+2+3 is recommended over flow b. Flow b might look more direct, but it leads to heavy, redundant data engineering: first rebuilding the context, then further massive transformation, aggregation, and summarization. In the end, flow b not only jeopardizes the quality of SAP business data but also incurs massive redundant engineering and maintenance effort.
Fig.1 data warehousing network flow
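What “rebuilding the context” means can be sketched with one small piece of application logic, time-dependent currency conversion. The rate table, the dates, and the function names below are hypothetical. The point is only that after a raw export, logic of this kind has to be re-implemented and maintained in the target data lake.

```python
from datetime import date

# Hypothetical time-dependent exchange rates to EUR: (currency, valid_from, rate).
RATES = [
    ("USD", date(2023, 1, 1), 0.94),
    ("USD", date(2023, 7, 1), 0.91),
]


def rate_on(currency: str, day: date) -> float:
    """Return the most recent rate that is valid on the given day."""
    if currency == "EUR":
        return 1.0
    candidates = [
        (valid_from, rate)
        for cur, valid_from, rate in RATES
        if cur == currency and valid_from <= day
    ]
    if not candidates:
        raise LookupError(f"no rate for {currency} on {day}")
    # max() picks the entry with the latest valid_from date.
    return max(candidates)[1]


def to_eur(amount: float, currency: str, posting_date: date) -> float:
    """Convert an amount to EUR using the rate valid on the posting date."""
    return round(amount * rate_on(currency, posting_date), 2)


march_value = to_eur(100.0, "USD", date(2023, 3, 15))
august_value = to_eur(100.0, "USD", date(2023, 8, 1))
```

The same 100 USD posting yields a different EUR value depending on its posting date. An export that carries only amount and currency loses exactly this kind of context.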
1.2. Google Cloud as Data & AI Backbone for SAP Landscape in Multi-Cloud Setup
In multi-cloud environments, Google Cloud Platform (GCP), for instance, allows RISE with SAP customers to benefit from the world’s knowledge and the wealth of services available in the cloud. As of the date this blog post was updated, Google Cloud services are available in 37 regions, 112 zones, 187 network edge locations, and 200+ countries and regions.
Google is widely known for its leadership and strong capabilities in data analytics and AI/ML, in both research and productization. Google Cloud offers a comprehensive end-to-end data platform with analytics services and Vertex AI, a native AI/ML platform that also provides advanced Generative AI (GenAI) capabilities. Services like BigQuery, Dataplex, Dataproc, Dataflow, and Looker enable the processing and analysis of large amounts of both structured and unstructured data. Vertex AI offers fully managed ML tools within a single, unified workflow. Google Generative AI App Builder (Gen App Builder) enables developers, even those with limited machine learning skills, to leverage Google’s foundation models, search expertise, and conversational AI to build chatbots and search engines for websites and across enterprise data.
In addition, Google also provides Google Trends Data and a vast library of public and commercial data sets, which can be easily consumed in Google BigQuery. Through Google Model Garden in Vertex AI, users can choose a model that fits their needs best. A bigger model isn’t always better and Google provides the flexibility to choose from first-party multimodal models from Google, open source models as well as third-party models.
1.3. SAP Datasphere as the Data ‘Bridge’ for SAP Landscape to Multi-Cloud
Most RISE customers have large legacy investments in SAP BW or SAP BW/4HANA. SAP Datasphere provides a Data Marketplace for data exchange and many more system connectivity options in multi-cloud environments. Datasphere will become the dominant data warehouse for SAP business data in the future. Nevertheless, to give SAP customers a smooth transition and to protect their legacy investments, SAP provides BW Bridge, which enables customers to use SAP Datasphere as an extended data staging layer to the multi-cloud environment. With this extended layer, customers can federate data from BW/4HANA or BW to Google BigQuery through SAP Datasphere.
For more information about Datasphere and how it complements BW/4HANA, please check this previous blog post.
Fig.2 System Landscape
2. Data Management Solutions Review
For SAP data management solutions, we did a review in this blog post, ‘Unlock the Power of Business Data for SAP RISE Customers: Mastering Data Management and Cultivating Insights’. We followed the flow of how business data is generated in the SAP landscape and how it is stored, and then, for analytical purposes, how ETL jobs can be done and how BI, ML, and AI can be approached.
Below, we also review Google Cloud data management services on customers’ own Google Cloud subscriptions. We explain what each service essentially is, based on Google documentation. A detailed overview of Google Cloud AI/ML services guidelines can be found here. In addition, we emphasize how these services can be integrated with SAP data management.
|Service Name||Service Type||Remarks|
|Cloud Data Fusion||ETL||
|Dataflow||ETL & Big Data||
|Dataproc||Data Lake / Big Data||
|Google Cloud Databases||Database||
|Google Cloud Storage||Data Lake||
|Databricks on GCP||AI/ML||
3. Enhanced Capability in Multi-Cloud for RISE Customers with GCP
By having their own GCP subscriptions in multi-cloud environments, RISE customers gain enhanced capabilities. Below we list some notable advantages.
3.1. Working with Multiple Hyperscaler Providers without Overhead
Many organizations store data in multiple public clouds. Often, this data ends up being siloed, because it’s hard to get insights across all of the data. Customers want to be able to analyze the data with a multi-cloud data tool that is inexpensive, fast, and does not create additional overhead of decentralized data governance. By using BigQuery Omni, Google Cloud reduces these frictions with a unified interface, with key benefits in cost, security and data governance, serverless architecture, ease of management, and cross-cloud transfer.
3.2. Generative AI
On 7 June 2023, Google Cloud announced the general availability of Generative AI support on Vertex AI, giving customers access to their latest platform capabilities for building and powering custom generative AI applications. With this update, developers can access their text model powered by PaLM 2, Embeddings API for text, and other foundation models in Model Garden, as well as leverage user-friendly tools in Generative AI Studio for model tuning and deployment. Backed by enterprise-grade data governance, security, and safety features, Vertex AI can make it easier than ever for customers to access foundation models, customize them with their own data, and quickly build generative AI applications.
As with the entire Cloud portfolio, Vertex AI and Gen App Builder help give customers complete control over their data; it doesn’t need to leave the customer’s tenant, is encrypted both in transit and at rest, and is not shared or used to train Google models. Google rigorously evaluates their new models to ensure they meet their Responsible AI Principles, and all of their generative AI offerings include the user security, data management, and access controls Google Cloud customers have come to expect.
3.3. Federated Machine Learning
Federated learning is a machine learning (ML) technique that enables a group of organizations, or groups within the same organization, to collaboratively and iteratively train and improve a shared, global ML model. By doing federated machine learning on Google Cloud, customers can train such a model on the business data in the SAP landscape combined with the data in the Google Cloud landscape.
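The idea can be sketched with federated averaging (FedAvg), one common federated learning algorithm. This post does not prescribe a specific algorithm, so treat the choice as an assumption. In the toy example below, two parties (think of one data set in the SAP landscape and one in Google Cloud) each keep their rows local and share only a model weight. The one-parameter model y = w * x and all numbers are made up.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs that never leave their owner


def local_update(w: float, data: Dataset, lr: float = 0.01, epochs: int = 5) -> float:
    """Run a few gradient-descent steps on one party's private data."""
    for _ in range(epochs):
        # Gradient of the mean squared error for the model y = w * x.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(w_global: float, parties: List[Dataset], rounds: int = 50) -> float:
    """FedAvg: each round, average the local weights, weighted by sample count."""
    total = sum(len(data) for data in parties)
    for _ in range(rounds):
        local_weights = [local_update(w_global, data) for data in parties]
        w_global = sum(w * len(data) for w, data in zip(local_weights, parties)) / total
    return w_global


# Two hypothetical parties whose data both follow y = 2 * x.
party_a: Dataset = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
party_b: Dataset = [(4.0, 8.0), (5.0, 10.0)]

shared_weight = federated_average(0.0, [party_a, party_b])
```

Only the weight travels between the parties and the coordinator. The raw rows stay where they are, which is the property that makes the technique attractive when data cannot be pooled.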
3.4. Integration with Enterprise Productivity Suite – Google Workspace
SAP and Google announced seamless integration between S/4HANA and Google Workspace, allowing S/4HANA users to leverage the collaboration capabilities in Google Workspace within their SAP business processes. For example, business users can collaborate and align on a list of journal entries for mass upload into S/4HANA, and power users can download data into Google Sheets to inform operational analytics or perform one-touch translation into an executive update in Google Slides.
Google Workspace, as a Google-native productivity suite, provides seamless integration with Google Cloud services. In the data management area, Google Workspace can change how enterprise users interact with and leverage data insights at work, helping them work more productively and intelligently.
4. Some Reference Architecture and Use Cases
Based on the SAP announcement of the partnership with Google Cloud, and with SAP mainstream solutions in mind, we designed some reference architectures and use cases for RISE with SAP customers in a multi-cloud environment with their own GCP subscription.
4.1. Predictive Sustainability Footprint Management
As shown in Fig. 2, RISE customers can combine business data from SAP Sustainability and data sets on Google Cloud Data Marketplace. By leveraging ML/AI capability, customers can predict short/mid-term carbon footprints and adjust their ESG strategy.
Customers can consume data sets like weather data, geographic data, and emissions data and store them in Google BigQuery, while SAP Sustainability Footprint Management analytics data is federated into BigQuery through SAP Datasphere. Google Vertex AI is then used to predict the short/mid-term carbon footprint.
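As a rough illustration of the prediction step, the sketch below fits a plain least-squares line that relates monthly emissions to average temperature and then predicts emissions for a forecast temperature. This is only a stand-in: the architecture above uses Vertex AI, and the feature choice, the function names, and all numbers here are made up.

```python
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(slope: float, intercept: float, x: float) -> float:
    """Evaluate the fitted line at x."""
    return slope * x + intercept


# Hypothetical history: average monthly temperature (deg C) from a weather
# data set, and monthly emissions (t CO2e) from the footprint data.
temperatures = [0.0, 5.0, 10.0, 15.0, 20.0]
emissions = [100.0, 90.0, 80.0, 70.0, 60.0]

slope, intercept = fit_line(temperatures, emissions)
forecast = predict(slope, intercept, 12.0)  # forecast temperature for next month
```

A real model would use more features and proper validation. The sketch only shows how external data and SAP data meet in one training set.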
Technical Material List:
4.2. Generative Sales Opportunity for SAP CX with Google Workspace Integration
As shown in Fig. 3, RISE customers can combine business data from SAP CRM and internet data from Google Cloud, generate sales opportunities and write them back to SAP CRM.
Customers can use Google Trends data, which is based on worldwide internet search requests, to determine which of their products are trending. For instance, a RISE customer whose business includes producing beverages can use Google Trends data to predict beer sales.
A Cloud Composer job can be created to ingest the data into BigQuery, Google’s petabyte-scale enterprise data warehouse. BigQuery can also federate SAP CRM data via Datasphere. Then, with Cortex Framework, Google Vertex AI is used to predict future sales and generate sales opportunity notes.
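Once Trends data is available in BigQuery, reading the currently trending terms could look like the sketch below. It assumes the public dataset table `bigquery-public-data.google_trends.top_terms` with the columns `term`, `rank`, and `refresh_date`; please verify the names against the dataset schema before relying on them. Running the query also assumes the `google-cloud-bigquery` package and valid credentials, which is why the client call is kept in a separate function.

```python
from typing import List, Tuple

# Assumed table name of the Google Trends public dataset in BigQuery.
TRENDS_TABLE = "bigquery-public-data.google_trends.top_terms"


def build_top_terms_sql(max_terms: int = 10, table: str = TRENDS_TABLE) -> str:
    """Build a query for the best-ranked terms of the latest refresh."""
    max_terms = int(max_terms)
    if max_terms < 1:
        raise ValueError("max_terms must be at least 1")
    return (
        f"SELECT term, MIN(rank) AS best_rank\n"
        f"FROM `{table}`\n"
        f"WHERE refresh_date = (SELECT MAX(refresh_date) FROM `{table}`)\n"
        f"GROUP BY term\n"
        f"ORDER BY best_rank\n"
        f"LIMIT {max_terms}"
    )


def fetch_top_terms(max_terms: int = 10) -> List[Tuple[str, int]]:
    """Run the query; needs google-cloud-bigquery and credentials."""
    from google.cloud import bigquery  # imported lazily so the sketch loads without it

    client = bigquery.Client()
    rows = client.query(build_top_terms_sql(max_terms)).result()
    return [(row["term"], row["best_rank"]) for row in rows]


sql = build_top_terms_sql(5)
```

The resulting terms could then be matched against product names from the federated SAP CRM data to decide which products deserve a closer look.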
To consume the sales opportunities generated by Google Cloud Vertex AI in the SAP landscape, SAP BTP Integration Suite can be used to integrate the Vertex AI API with the SAP Cloud for Customer API. SAP Analytics Cloud can then be used to visualize the result. The result can also be integrated with Google Workspace by adding an agenda item for the corresponding AE to review the generated sales opportunity notes.
Technical Material List:
4.3. Use cases from the community
|Use Case/Reference Architecture||Link|
|Faster retailer success with Google Cloud Shelf Checking AI and Cortex Framework with SAP||Google Blog|
|Drive business innovation using SAP and Google Cloud Data Platforms||SAP Blog|
Google Cloud is a great way to complement SAP functionality as part of a multi-cloud strategy. In particular, data like Google Trends, weather, or sustainability data can create new business insights for SAP customers. This data can easily be made available from Google BigQuery, Google’s planet-scale enterprise data warehouse, directly to SAP Datasphere. But it does not stop there: many other global data sources, both public and chargeable, can be made accessible via Google Cloud Analytics Hub. Even more becomes possible when using Google’s AI/ML capabilities for business predictions or customer classifications.
This blog post is a joint work between the SAP RISE CAA Reference Architecture group and the Google Customer Engineer team, written using Google Docs. Google Docs is an online editing platform that allows users across organizations to collaborate on any device in real time, and it is part of Google Workspace.
- The blog content does not necessarily represent the official opinion of SAP or Google Cloud. The opinions appearing in this blog are backed by SAP or Google documentation, which can be found via the corresponding reference links.
- SAP takes no responsibility for managing and operating customers’ own data centers, nor for customers’ own hyperscaler subscriptions.
- SAP notes that posts about potential uses of generative AI and large language models are merely the individual poster’s ideas and opinions, and do not represent SAP’s official position or future development roadmap. SAP has no legal obligation or other commitment to pursue any course of business, or develop or release any functionality, mentioned in any post or related content on this website.
Acknowledgment to contributors/reviewers/advisors:
*knowledge is meant to be shared, and copyright matters
Ke Ma (a.k.a. Mark), co-author, Senior Cloud Architect, RISE Cloud Advisory RA group
Special THANK YOU to Google Cloud colleagues who co-authored/contributed to this blog:
Thorsten Staerk, co-author, Customer Engineer, Google
Adrian Beheschti, Customer Engineer, Google
Vadim Zaripov, EMEA Data Analytics Lead, Google
Blake Lanning, Customer Engineer, Google
Jesper Christensen, Customer Engineer, Google
Michael Harding, Partner Manager, Google
Osmar Vinci, Customer Engineer, Google
and the Google Cloud SAP Customer Engineer team
Michael Truong Ngoc, Machine Learning Engineer, SAP IES AI CoE
Murad Mursalov, Cloud Architect & Advisor, RISE Cloud Advisory
Sven Bedorf, Co-head of Cloud Architecture & Advisory, RISE Cloud Advisory, MEE
Kevin Flanagan, Head of Cloud Architecture & Advisory, RISE Cloud Advisory, EMEA North
Daniel Temming, Co-head of Cloud Architecture & Advisory, RISE Cloud Advisory, MEE
Luc DUCOIN, Cloud Architect & Advisor, RISE Cloud Advisory
Richard Traut, Cloud Architect & Advisor, RISE Cloud Advisory
Frank Gong, Digital Customer Engagement Manager, SAP ECS