Data Integration & Data Source Connectivity in SAP Data Warehouse Cloud
SAP Data Warehouse Cloud helps us build a stable virtual, business and semantic layer across a heterogeneous, ever-growing and ever-changing system landscape. Technology changes rapidly, but the ideas, values and guidelines our analytics business is based on typically remain pretty stable and change only very slowly. If they change, they do so only because our surrounding business changes, but never because our infrastructure moves to a newer technology.
Separating Semantics from Technology
Unfortunately, the last part of the above statement does not always hold true. When the technology we use to run our daily business changes, we often make decisions about our analytics strategy based on that technology. Technology influences the way we do analytics. But it should really be the other way round, if the two are not disconnected altogether. The best-case scenario is an environment where our analytics business remains untouched by any technology change and only needs adjustments if our real-world business environment demands it.
Therefore it is a key requirement to separate semantics (how is a specific KPI calculated?) from technology (how and where is my data stored?).
SAP Data Warehouse Cloud offers an environment where this is possible. With tools like the Business Builder, the Graphical View Builder, the SQL View Builder and Data Flows at hand, you can create a semantic layer which stays independent of any change in the underlying data layer. Any analytics tool connected to the semantic layer is not impacted by changes to the underlying data layer.
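To make the idea a bit more tangible, here is a minimal sketch of the principle, not of SAP Data Warehouse Cloud itself. It uses Python's built-in sqlite3 module as a stand-in data layer, and all table, view and column names (sales_v1, sales_v2, v_net_revenue) are made up for illustration. The KPI definition lives in a view, the storage layout changes underneath it, and the consuming query stays exactly the same.

```python
import sqlite3

# SQLite is only a stand-in for the data layer; none of this is
# SAP Data Warehouse Cloud code. All object names are hypothetical.
conn = sqlite3.connect(":memory:")

conn.executescript("""
    -- Data layer, version 1: amounts stored as decimal values.
    CREATE TABLE sales_v1 (region TEXT, amount REAL, discount REAL);
    INSERT INTO sales_v1 VALUES
        ('EMEA', 100.0, 10.0), ('EMEA', 50.0, 0.0), ('APJ', 80.0, 5.0);

    -- Semantic layer: the KPI "net revenue" is defined once, in the view.
    CREATE VIEW v_net_revenue AS
        SELECT region, SUM(amount - discount) AS net_revenue
        FROM sales_v1
        GROUP BY region;
""")


def report(connection):
    """The consuming application only ever queries the semantic view."""
    return connection.execute(
        "SELECT region, net_revenue FROM v_net_revenue ORDER BY region"
    ).fetchall()


before = report(conn)

conn.executescript("""
    -- Technology change: new table, new column names, amounts in cents.
    CREATE TABLE sales_v2 (region_code TEXT, gross_cents INTEGER, discount_cents INTEGER);
    INSERT INTO sales_v2
        SELECT region, CAST(amount * 100 AS INTEGER), CAST(discount * 100 AS INTEGER)
        FROM sales_v1;

    -- Only the view is re-pointed; the KPI semantics stay the same.
    DROP VIEW v_net_revenue;
    CREATE VIEW v_net_revenue AS
        SELECT region_code AS region,
               SUM(gross_cents - discount_cents) / 100.0 AS net_revenue
        FROM sales_v2
        GROUP BY region_code;
    DROP TABLE sales_v1;
""")

after = report(conn)
print(before == after)
```

The report function never learns that the storage moved. In SAP Data Warehouse Cloud the same role would be played by your views and business entities rather than by a hand-written SQLite view.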
Starting from the Ground Up
To establish this separation, SAP Data Warehouse Cloud offers different ways to integrate remote data. The questions typically asked fall into these categories:
- Which systems can SAP Data Warehouse Cloud connect to?
- What artefacts / entities from connected systems can be consumed in SAP Data Warehouse Cloud?
- Which data access methods are supported?
- Which authentication techniques can be used?
- Which tooling is required to integrate on-premise data sources?
Connectivity Options in SAP Data Warehouse Cloud
With SAP Data Warehouse Cloud we follow different approaches in parallel to cover all the aforementioned points and to integrate your data.
Integrating data sources by pulling (Pull) data into SAP Data Warehouse Cloud is part of its out-of-the-box connectivity. The number of remote data sources directly supported by SAP Data Warehouse Cloud is growing over time and covers different aspects like SAP and non-SAP data sources, hyperscaler connectivity, and cloud and on-premise data sources. The SAP Road Map Explorer receives frequent updates with new and planned connectivity options for SAP Data Warehouse Cloud.
This option is available from the Connections section of Space Management in your SAP Data Warehouse Cloud tenant. Connections created this way can be used in the different tools like the SQL View Builder, the Graphical View Builder and Data Flows to create and fill your data models. Data from the connected sources can be accessed live (virtual / federated, not talking about authentication mechanisms here yet ;)), replicated to, or extracted into SAP Data Warehouse Cloud.
Connecting to on-premise sources (all sources not directly accessible on the public internet) requires the setup of either the Smart Data Integration (SDI) Data Provisioning Agent or the SAP Cloud Connector or both, depending on which sources you need to connect to. These agents act like proxies or gateways for SAP Data Warehouse Cloud into your local network. Without these agents you would have to expose your systems to the public internet to allow SAP Data Warehouse Cloud to connect, which is not what you want, trust me. 🙂
Generic SQL Connection Capabilities
However, we, the SAP Data Warehouse Cloud development organisation, do know that the number of data sources customers want to connect and integrate is so huge that we probably can never, or at least not as of now, cover all these connectivity needs with only the natively integrated connection options shown in the above pictures.
Therefore SAP Data Warehouse Cloud additionally offers the possibility to let you push (Push) data with any third-party SQL client into your SAP Data Warehouse Cloud space. By creating so-called Database Users (SQL endpoints) in your space management overview, you can connect any tool which is capable of connecting to a SQL target. Pro tip: The Database User functionality can be used to consume data in a third-party application, too! 😉
This generic option to move data physically into your SAP Data Warehouse Cloud space is your gate-opener for integrating any data source into SAP Data Warehouse Cloud, as long as you can find a tool which can connect to your source and write data into a SQL-based target application. You can also build your own SQL client if you want to move data into your SAP Data Warehouse Cloud tenant. 😉 Heads up: Soon SAP Data Warehouse Cloud will add the Generic SDI Apache Camel JDBC adapter to its set of native connections. Stay tuned for another blog of mine explaining the beauty of this adapter and the benefits it brings to the data warehousing table.
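If you do want to build your own little SQL client, the push pattern could look roughly like the sketch below. It is written against Python's generic DB-API 2.0 interface and uses an in-memory SQLite database as a stand-in target so that it runs anywhere. The helper name push_rows, the table orders and its columns are hypothetical. For a real tenant you would instead open the connection with a SAP HANA-compatible driver (for example SAP's hdbcli Python client) using the host, port and credentials of your Database User; that part is an assumption and is not shown here.

```python
import sqlite3


def push_rows(connection, table, columns, rows):
    """Insert rows into an existing SQL table via any DB-API 2.0 connection.

    Assumes the driver accepts '?' (qmark) placeholders. Table and column
    names must come from trusted configuration, because SQL identifiers
    cannot be passed as bound parameters.
    """
    column_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    statement = f"INSERT INTO {table} ({column_list}) VALUES ({placeholders})"
    cursor = connection.cursor()
    try:
        cursor.executemany(statement, rows)
        connection.commit()
    finally:
        cursor.close()
    return len(rows)


# Stand-in target: an in-memory SQLite database instead of a real
# Database User (SQL endpoint) of a space.
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL)")

# Rows as they might have been read from some source system.
source_rows = [(1, "ACME", 120.5), (2, "Globex", 80.0)]

pushed = push_rows(target, "orders", ["order_id", "customer", "amount"], source_rows)
stored = target.execute(
    "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
).fetchall()
print(pushed, stored)
```

Because the helper only relies on cursor(), executemany() and commit(), swapping the SQLite connection for another DB-API connection should not require changes to the function itself, provided the driver uses the same placeholder style.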
Partner Connectivity Platforms
Another strategy we are pursuing is embedding partners as dedicated connection options into SAP Data Warehouse Cloud as well as enabling partners like Adverity, SnapLogic, APOS, Datazeit, Precog, Informatica, and others to write data into your space in SAP Data Warehouse Cloud using their connectivity platforms.
However, this option is not part of the actual SAP Data Warehouse Cloud offering and customers are required to license the partner solution in order to connect their SAP Data Warehouse Cloud tenant to the partner’s platform.
As of today, SAP Data Warehouse Cloud offers Basic Authentication only when connecting to remote sources. Whenever you connect to a remote source which requires you to authenticate before you can access its data, you have to specify the credentials at design time when creating the connection. Currently SAP Data Warehouse Cloud does not allow for creating real live connections as you may know them from SAP Analytics Cloud, which connects to remote sources using single sign-on and SAML assertions.
All the aforementioned connectivity options are represented by different components in the SAP Data Warehouse Cloud architecture. At a high level, components 2) and 3) are part of SAP HANA Cloud. Smart Data Access and Smart Data Integration as well as the Database Users components are wrapped by the SAP Data Warehouse Cloud application to offer the functionality to its users. The Data Flow component 1) is contributed by SAP Data Intelligence Cloud. Partners are the fourth component, contributing to the rich connectivity of SAP Data Warehouse Cloud by connecting to its SQL endpoints.
Live Data Access, Data Replication and Extraction
Connections created in your SAP Data Warehouse Cloud space can be used in the different modeling tools, but be careful: Not every connection can be used in both the View Builders (Graphical or SQL) and Data Flows. The Create Connection dialog in your SAP Data Warehouse Cloud space tells you which connection type can be used with which tool (check out my other blog for all the details).
The different tools are mainly built for two different purposes. The Graphical and SQL View Builders are for building data models that are virtual / federated by default, with the option to persist the models if needed (see Data Integration Monitor), for example for performance reasons. Data Flows are for building your ETL processes for extracting, transforming and storing data in your SAP Data Warehouse Cloud space.
The tools share a basic set of actions and transformations like join, union and calculations, but at the same time each comes with unique functionality, like SQL script as part of the two View Builders and Python scripting capabilities as part of Data Flows.
The various data integration and connectivity options combined with the different tools for data modeling allow you to establish a semantic layer which is only loosely coupled to the underlying data layer. This layer is stable and reliable for any data-consuming application on top, and it allows for non-disruptive changes to the data layer.
With the ever-growing connectivity capabilities in SAP Data Warehouse Cloud, you can make it your central piece for connecting your applications via a harmonised, virtual or materialised semantic and data layer to your heterogeneous system landscape, hiding all the hard-to-understand and hard-to-maintain specifics and edge cases from your frontend-facing applications.
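The difference between a virtual model and a persisted one can be illustrated with a small, generic sketch. Again this uses Python's built-in sqlite3 module as a stand-in and is not how SAP Data Warehouse Cloud implements view persistency; the names remote_orders, v_order_total and p_order_total are hypothetical. The view always reflects the current source data, while the persisted copy only changes when it is refreshed.

```python
import sqlite3

# SQLite stand-in; object names are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Pretend this table lives in a remote source system.
    CREATE TABLE remote_orders (order_id INTEGER, amount REAL);
    INSERT INTO remote_orders VALUES (1, 10.0), (2, 20.0);

    -- Virtual model: evaluated against the source on every query.
    CREATE VIEW v_order_total AS
        SELECT SUM(amount) AS total FROM remote_orders;

    -- Persisted model: a snapshot of the view's result at this moment.
    CREATE TABLE p_order_total AS SELECT * FROM v_order_total;
""")


def total(name):
    """Read the single 'total' value from the given view or table."""
    return conn.execute(f"SELECT total FROM {name}").fetchone()[0]


# New data arrives in the source after the snapshot was taken.
conn.execute("INSERT INTO remote_orders VALUES (3, 30.0)")

live_total = total("v_order_total")       # sees the new order
persisted_total = total("p_order_total")  # still the old snapshot

# Refreshing the snapshot brings the persisted model up to date again.
conn.executescript("""
    DELETE FROM p_order_total;
    INSERT INTO p_order_total SELECT * FROM v_order_total;
""")
refreshed_total = total("p_order_total")
print(live_total, persisted_total, refreshed_total)
```

The trade-off is the usual one: the virtual model is always current but pays the cost of reaching the source on every query, while the persisted model is fast to read but only as fresh as its last refresh.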
Try it Out Yourself! Our Free 30-Day Trial Offering is Here to Help.
You can get yourself a free 30-day SAP Data Warehouse Cloud trial tenant with all the features enabled. Check out our free trial page here.
Let me know in the comments or ask your question in the Q&A area.