Data Ingestion Options in SAP Data Warehouse Cloud
The easiest way to get data into SAP Data Warehouse Cloud is to set up a connection to your data sources and retrieve the data from there. Though this is certainly the most common scenario, there are a few other interesting data ingestion options covering different use cases, which I want to highlight in this blog post.
In addition, I have added a new chapter to the Modeling Tutorial (https://github.com/SAP-samples/data-warehouse-cloud-modeling), which is dedicated to the data ingestion topic (Exercise 4) and shows in a practical manner how you can use the different options in SAP Data Warehouse Cloud.
Now, let's have a closer look at the various options:
Pulling data via Connection
This is the classic and standard way to ingest data into SAP Data Warehouse Cloud. You set up a connection to the source system and then pull the data in. SAP Data Warehouse Cloud comes with various connectors and adapters for SAP and non-SAP sources. There is also an option to bring your own connector in order to access a specific source. Once the connection is set up, you can read the data from the source system in two different modes:
- Data Federation (remote data access): keep the data in the sources and access it remotely. This approach lowers the cost of operations, since you do not consume any additional disk space. However, please be aware of potential performance penalties and security implications.
- Data Replication: copy and transfer the data physically into SAP Data Warehouse Cloud. This approach uses the underlying disk space in SAP Data Warehouse Cloud. You benefit from better performance, since the data is stored in the local persistence. In SAP Data Warehouse Cloud the underlying persistence is SAP HANA Cloud.
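To make the difference between the two modes more tangible, here is a minimal Python sketch. It is only an illustration and not how SAP Data Warehouse Cloud works internally: two in-memory sqlite3 databases stand in for the source system and the local persistence, and the table `orders` as well as the helper functions are made up for this example.

```python
import sqlite3

# Assumption: an in-memory sqlite3 database stands in for the remote source system.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
source.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 25.5)])
source.commit()

# Assumption: a second in-memory database stands in for the local persistence.
warehouse = sqlite3.connect(":memory:")


def federated_total(src):
    """Federation: each query is sent to the source, nothing is stored locally."""
    return src.execute("SELECT SUM(amount) FROM orders").fetchone()[0]


def replicate_orders(src, dst):
    """Replication: copy the rows physically into the local store."""
    rows = src.execute("SELECT id, amount FROM orders").fetchall()
    dst.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
    dst.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    dst.commit()
    return len(rows)


def replicated_total(dst):
    """Query the local copy only; the source is not contacted."""
    return dst.execute("SELECT SUM(amount) FROM orders").fetchone()[0]


copied = replicate_orders(source, warehouse)

# A new row arrives in the source after the replication run.
source.execute("INSERT INTO orders VALUES (3, 4.5)")
source.commit()

live_total = federated_total(source)        # includes the new row
local_total = replicated_total(warehouse)   # reflects the state at copy time
print(copied, live_total, local_total)
```

In this toy setup the federated query always reads the current state of the source, while the replicated copy occupies local space and only changes when the copy job runs again.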
Pushing data to Open SQL Schema
The Pushing Data approach is similar to the Pulling Data approach. The main difference is the trigger point, i.e. who initiates the data load process.
In the Pulling Data scenario the initiator is SAP Data Warehouse Cloud itself, whereas in the Pushing Data scenario the initiator is an external client which writes the data into a dedicated persistence of SAP Data Warehouse Cloud: the Open SQL Schema.
The configuration of an Open SQL Schema comes with a database user that has access rights to read from and write to the SAP HANA Cloud database.
Using the database user credentials, you can access the schema with e.g. SAP HANA Database Explorer (which comes with the SAP Data Warehouse Cloud delivery) or SAP Data Intelligence. With that you can execute standard SQL statements (DDL + DML) and integrate the artefacts later in the modeling. Of course, you can also use any third-party client (e.g. DBeaver) or ETL tool of your choice to push data to the system.
Here are some other examples of how you can use the Open SQL Schema for different use cases:
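As a rough illustration of the push idea, the following Python sketch shows an external client that creates a table (DDL) and inserts rows (DML) through a standard DB-API connection. Everything specific here is an assumption made for the example: the table `SALES_STAGING`, its columns and the helper names are hypothetical, and an in-memory sqlite3 database stands in for the Open SQL Schema so that the snippet runs without a system at hand.

```python
import sqlite3


def create_target(conn):
    """DDL: create the (hypothetical) staging table once."""
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE SALES_STAGING ("
        "ID INTEGER PRIMARY KEY, REGION NVARCHAR(20), AMOUNT DECIMAL(10,2))"
    )
    conn.commit()


def push_rows(conn, rows):
    """DML: the external client writes the rows into the staging table."""
    cur = conn.cursor()
    cur.executemany(
        "INSERT INTO SALES_STAGING (ID, REGION, AMOUNT) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)


# Stand-in connection. Against a real system you would instead open a
# connection with the database user of the Open SQL Schema, for example via
# SAP's hdbcli package (from hdbcli import dbapi; dbapi.connect(...)).
# Host, port and credentials depend on your tenant; please check the client
# documentation for the exact parameters.
conn = sqlite3.connect(":memory:")

create_target(conn)
pushed = push_rows(conn, [(1, "EMEA", 120.5), (2, "APJ", 80.0)])

row_count = conn.execute("SELECT COUNT(*) FROM SALES_STAGING").fetchone()[0]
print(pushed, row_count)
```

The point of the sketch is the direction of the load: the client decides when to write, and SAP Data Warehouse Cloud only provides the schema and the database user.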
- SAP Data Intelligence Cloud – File Upsert into SAP Data Warehouse Cloud
- HOW TO use the SAP Business Technology Platform to extend SAP Marketing Cloud with Predictive Scores
- Market Basket Analysis using Embedded Machine Learning Features in SAP Data Warehouse Cloud
Sharing data from Spaces
In general, you can use Spaces in SAP Data Warehouse Cloud to isolate resources such as CPU, disk space, users and data by grouping them logically according to business needs.
However, Spaces can also be used to share dedicated data and models with other spaces and make them available there. The shared data and models can be used as if they were in the space you are working in. Hence, it is possible to reuse existing data and models from other spaces without the need to move and synchronize data back and forth. A typical scenario is a central space for master data. The benefit is a much lower development effort and TCO.
Downloading data from Data Marketplace
One of the latest features in SAP Data Warehouse Cloud is the Data Marketplace, where SAP partners and customers can offer data and models for download or purchase. For instance, that could be tailor-made content for a specific industry or a market dataset which you require for your business scenario. The Data Marketplace allows you and others to exchange and monetize datasets, which can be easily integrated with your models in SAP Data Warehouse Cloud.
For more information on SAP Data Warehouse Cloud, please also check out the following blog:
With that, I hope I have given you an overview of the options to get data into SAP Data Warehouse Cloud. These options cover additional scenarios and give customers more flexibility in integrating data.