Bi-directional Data Exchange for Individualized Data Products in the Data Marketplace
In this blog post, you will learn how data providers and data consumers can exchange data on an individualized basis thanks to the Data Marketplace technology and processes. If you want to learn more about the Data Marketplace for SAP Data Warehouse Cloud in general, check out the following overview blog post.
The entire process, from setup to the operations phase with seamless and governed updates, is demonstrated in 10 simple steps in the demo video below. In the use case, a retail customer that runs SAP Data Warehouse Cloud wants to retrieve the number of e-charging stations around his outlets from a Data Provider that has data about all charging stations in Germany. Instead of simply “throwing over” the entire data set, the Data Provider enriches the data and provides it as a private data product.
<20′ youtube video to be embedded once uploaded>.
The 10 steps (7 for setup and 3 for operation) are shown in the marchitecture below, where you can see how the data from the ultimate data consumer is shared as reference data, used to produce the individualized data on the provider side, and shared back as an individualized data product for the final data enrichment.
The 10 steps with some additional explanations are the following:
As a first step, the data consumer needs to make his internal data available to the data provider as reference data. If he wishes to do that in an integrated fashion via the Data Marketplace, he needs to have the data in a deployed view that he can use in the subsequent step in the Data Builder.
In this demo scenario, it is a master data view that contains store locations, initially 542 stores with IDs below 300,000. For the demo, we later increase the number of stores by adjusting the filter on a larger data set. In reality, obviously, this data would be updated based on changes in the source systems or the ETL processes.
Using the license key that the retail customer has provided, the Data Provider can now act as a data consumer and load the reference data into his Data Factory Space, where he will create the private data product. He can either handle all data products in one space or have one space per customer. This has no cost effect and depends on the subsequent steps. He can use the Delivery Tracking to verify that the data transfer job has run successfully. It also tells him how many records have been transferred and now need to be enriched.
Now the individualization takes place. The Data Provider brings together the reference data (stores with geolocation data in the form of longitude and latitude columns) with his database of e-charging stations in Germany, which also carries such geolocation data. For the demo setup, we have chosen a straightforward approach: we reduce the number of decimals of the geolocations to create 111x555m grids and match stores and charging stations that fall into the same grid cell. There are obviously more sophisticated options, e.g. leveraging the SAP HANA Spatial engine to calculate the distances between the geo tuples.
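The grid idea can be illustrated outside of SAP Data Warehouse Cloud with a few lines of Python. This is only a minimal sketch under stated assumptions, not the logic used in the demo: the function names, the sample coordinates, and the choice of rounding to 3 decimals are hypothetical, and the resulting cell size depends on the number of decimals and on the latitude.

```python
from collections import Counter


def grid_cell(lat, lon, decimals=3):
    """Map a coordinate to a grid cell by rounding away decimals.

    Assumption: plain rounding; the demo may cut off decimals differently.
    """
    return (round(lat, decimals), round(lon, decimals))


def count_stations_per_store(stores, stations, decimals=3):
    """Count the charging stations that share a grid cell with each store."""
    stations_per_cell = Counter(
        grid_cell(s["lat"], s["lon"], decimals) for s in stations
    )
    # Stores without any station in their cell are kept with a count of 0.
    return {
        store["store_id"]: stations_per_cell[
            grid_cell(store["lat"], store["lon"], decimals)
        ]
        for store in stores
    }


# Hypothetical sample data: reference data (stores) and provider data (stations).
stores = [
    {"store_id": 100001, "lat": 52.5204, "lon": 13.4051},
    {"store_id": 100002, "lat": 48.1372, "lon": 11.5756},
    {"store_id": 100003, "lat": 50.1109, "lon": 8.6821},
]
stations = [
    {"station_id": "A", "lat": 52.5201, "lon": 13.4049},
    {"station_id": "B", "lat": 52.5198, "lon": 13.4052},
    {"station_id": "C", "lat": 48.1374, "lon": 11.5758},
    {"station_id": "D", "lat": 53.5511, "lon": 9.9937},
]

result = count_stations_per_store(stores, stations)
print(result)
```

A grid match like this is cheap, but it misses stations that sit just across a cell border, which is one reason why a real distance calculation is the more sophisticated option.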
After the actual creation of the data, the Data Provider can proceed with the listing. He is free to maintain the details based on the agreement with the consumer and the context details that the ultimate user needs. Most likely, the Data Provider will have a separate public data product that gives all information about the private product, including sample data to try it out, while this listing only serves to deliver the actual data. Just as the consumer has done before, the Data Provider creates the License and a Release for the governed and transparent update process later on.
Everything is now set for the most important step. The consumer can load the individualized data product into his space of choice, in this scenario his initial POS 360° space. The Delivery Tracking shows transparently that all 538 records have been shared back: not only the enriched records but also the ones for which no charging stations have been identified. If it is part of the commercial agreement, the Data Provider could also have shared a second view within the same data product that lists the charging stations that have been matched.
Last but not least, the Data Consumer can use the entire data management functionality to process the data further in SAP Data Warehouse Cloud or use it with SAP Analytics Cloud or 3rd Party Tools. In this demonstration, we finish by joining the enriched data back with the initial store data set to obtain the additional KPI: Number of Charging Points.
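Conceptually, this last step is a left join of the store master data with the individualized data product on the store ID. The following Python sketch illustrates that idea under stated assumptions: the field names (`store_id`, `charging_points`) and the sample rows are hypothetical, and in SAP Data Warehouse Cloud the join would be modeled in a view rather than in code.

```python
def add_charging_kpi(stores, enriched):
    """Left-join the enriched records onto the stores by store_id.

    Stores that are not (yet) part of the enriched data keep None as KPI,
    which distinguishes "not enriched yet" from "0 charging points found".
    """
    kpi_by_store = {row["store_id"]: row["charging_points"] for row in enriched}
    return [
        {**store, "charging_points": kpi_by_store.get(store["store_id"])}
        for store in stores
    ]


# Hypothetical sample data.
store_master = [
    {"store_id": 1, "city": "Berlin"},
    {"store_id": 2, "city": "Munich"},
    {"store_id": 3, "city": "Hamburg"},
]
data_product = [
    {"store_id": 1, "charging_points": 2},
    {"store_id": 2, "charging_points": 0},
]

joined = add_charging_kpi(store_master, data_product)
for row in joined:
    print(row)
```

Keeping all stores in the result mirrors the demo, where records without identified charging stations are shared back as well.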
OPERATIONS PHASE
The data consumer and the data provider are now wired together, and if new data is to be enriched, the process can easily be triggered in a transparent way. To simulate newly available data in this demonstration, we change the filter from all records with IDs below 300,000 to all records with IDs below 500,000, which increases the number of stores from 538 to 1,058. To request data enrichment for the new data, the consumer simply uses the Publishing Management to hand over the new records.
At design time, the Data Provider has chosen the Update Mode “Manual”. As a consequence, he can see in the Delivery Tracking that the status of his activated Data Product is “Outdated”. He can now pull the new update to start his update process. Alternatively, with the Update Mode “Immediate”, the new update would have been ingested directly after the release creation on the consumer side.
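The difference between the two update modes can be pictured as a small state model. The sketch below is purely illustrative and does not represent a Data Marketplace API: the class, its attributes, and the “Up to date” label are hypothetical; only the “Outdated” status and the two update modes are taken from the scenario.

```python
from dataclasses import dataclass


@dataclass
class ActivatedProduct:
    """Hypothetical model of an activated data product on the receiving side."""

    update_mode: str          # "manual" or "immediate"
    loaded_release: int = 1   # release currently ingested
    latest_release: int = 1   # newest release published by the other party

    def on_new_release(self, release):
        """Called when the sharing party creates a new release."""
        self.latest_release = release
        if self.update_mode == "immediate":
            # Immediate mode ingests the update without user interaction.
            self.pull_update()

    def pull_update(self):
        """Ingest the newest release (the manual 'one click' step)."""
        self.loaded_release = self.latest_release

    @property
    def status(self):
        if self.loaded_release < self.latest_release:
            return "Outdated"
        return "Up to date"  # assumed label, not taken from the product


manual = ActivatedProduct(update_mode="manual")
manual.on_new_release(2)
print(manual.status)      # status before the receiver pulls the update

immediate = ActivatedProduct(update_mode="immediate")
immediate.on_new_release(2)
print(immediate.status)   # status right after the release was created
```

The manual mode keeps a human checkpoint in the process, which fits the highly governed exchange shown in this demo.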
Now it also becomes apparent why we have chosen the Intelligent Lookup as the harmonization environment for the Data Provider. As the Data Marketplace has updated the records directly in the Input Table, the matching status changes, and we now have 49% of the records for which no matching has been conducted yet. After triggering a run of the Intelligent Lookup, we can see that we now have e-charging stations for 134 stores based on the matching rule.
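The 49% figure is consistent with the record counts of the demo: 538 records had already been processed, and the update brought the total to 1,058. The following Python sketch reproduces that arithmetic under stated assumptions; the status labels, the function names, and the use of sequential IDs are hypothetical and only model the idea that newly delivered records enter the Input Table without a matching result.

```python
def upsert_reference_records(input_table, incoming_ids):
    """Add newly delivered IDs as 'unprocessed'; keep existing rows unchanged."""
    for store_id in incoming_ids:
        input_table.setdefault(store_id, "unprocessed")
    return input_table


def unprocessed_share(input_table):
    """Share of records for which no matching has been conducted yet."""
    pending = sum(1 for status in input_table.values() if status == "unprocessed")
    return pending / len(input_table)


# 538 records were already processed during the setup phase.
table = {store_id: "processed" for store_id in range(538)}

# The new release delivers 1,058 records, including the 538 known ones.
upsert_reference_records(table, range(1058))

share = unprocessed_share(table)
print(f"{share:.1%} of {len(table)} records are still unprocessed")
```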
Finally, with just one click (which could also have been made obsolete by choosing the Update Mode “Immediate”), the new data is ingested into the table that was created with the initial activation of the Data Product. The join in the view now shows the 134 stores for which e-charging station data has been added.
In this demo setup, we have consciously chosen a highly governed process in which both parties use the Publishing Management to request new data or push new data, respectively. In case you want to automate such an exchange with a live query setup for the reference data or also for the data enrichment, reach out to us by email at email@example.com.
While this was a demo scenario to explain the use case and supporting functionality and processes, you can actually check out the free “E-Charging Stations in Germany” Data Product by the “Open Data connected by SAP” Data Provider in the Data Marketplace. In case you are not an SAP Data Warehouse Cloud customer yet, just sign up for a free trial for up to 90 days right here.