Week 1: Automatic Data Governance with SAP Business Technology Platform
Summary. In this article of the series “Give Data Purpose Weekly” I share some insights into which technical data management capabilities support a successful data governance platform implementation. Besides the technical side there is of course an organizational part as well, but let’s keep that for another article.
The past: Little did we have…
I remember the good old times when, besides going to the big cinema, we could rent a physical video tape at the video rental shop. What a feeling: opening the door and being overwhelmed by the full shelves of VHS cassettes, grouped and ordered by genre. We knew exactly which row to turn to in order to find our favorite movie. That was easy, given the limited number of available videos. And if we occasionally could not find the movie we wanted, we just asked the guy behind the desk (yes, it was always a guy), who was able to point us to the right place within a split second. He had full overview and control. No one could pick an age-rated video without his approval. And today?
From manual gatekeepers to digital recommendations
Access to video has changed completely. Online video and streaming services grant access to a huge number of movies, documentaries, and concerts. Nobody can oversee this media lake, neither visually nor manually. Thankfully, modern technology allows us to access almost anything via platforms such as Netflix, Amazon, Apple TV+, and YouTube. They all provide digital assistants that find the right offering for you, automating information extraction, classification, and recommendation with intelligent technology and algorithms that manage the huge amount of available metadata.
Does this remind you of other scenarios as well? Scenarios where data, and I mean a lot of data, needs to be provided in the right context to deliver value to the user? Your photo library on your mobile phone, offering you stories and even automatically creating short videos with audio from your last vacation? How about your last online shopping experience, where you received tailored offers based on your recent purchases or searches?
All these scenarios are data driven. A lot of data is collected from different sources and combined with user interaction. Because of its sheer amount, this data is automatically analyzed, classified, and categorized with AI and machine learning algorithms to derive value for the user from the available data, in other words: to #GiveDataPurpose.
Challenges of data driven organizations
What we experience in our private lives mirrors the challenges that organizations face today. The sheer volume of available data and the low cost of data storage have enabled almost every organization to collect every bit and byte of information. But the uninformed excitement about having all that data is now turning into informed disillusionment about how to derive value from it.
How can I efficiently connect to all data? For some organizations the Data Lake (or Data Swamp) is the starting point: they first put everything they have into an additional central data store to have it all in one place. This not only creates additional, redundant data storage, it also requires additional effort to synchronize and replicate the data. Simply copying data that carries a lot of semantic meaning in the context of its original application also decreases the value of that data. Virtual and federated data access and integration, in contrast, leave the data in its original context and semantics.
Katrin introduces the recent update to the Integration Solution Advisory Methodology, and hot off the press is our Integration Architecture Guide for Cloud and Hybrid Landscapes, which includes a whole chapter on Data Integration.
How can I better understand my data? To understand where relevant data is stored across the different data sources, data cataloging and data classification technologies that crawl and analyze the content of these remote sources are the important starting point. Leveraging artificial intelligence and machine learning provides even deeper insight into these “black holes of data”. For the classification of, for example, Personally Identifiable Information, not only the individual words are relevant but also the surrounding context. Artificial intelligence can help here by automating detection and classification, as happens in SAP Data Intelligence with Content Type Tagging or our supporting BigID Solution Extension.
In the openSAP Microlearning Expert SAP Data Intelligence series you can find several videos on Metadata Explorer topics like: Glossaries & Relationships, Publishing & Profiling or Data Lineage.
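To illustrate why the surrounding context matters for classification, here is a minimal Python sketch. It is a toy, rule-based example and not how SAP Data Intelligence Content Type Tagging or BigID work internally; the keyword lists, the labels, and the function name classify_numbers are all hypothetical. The same kind of value (a long digit sequence) is tagged differently depending on the words that precede it in the same sentence.

```python
import re

# Hypothetical context keywords; a real classifier would learn such signals
# from data instead of relying on a hand-written list.
CONTEXT_HINTS = {
    "phone": {"phone", "call", "mobile", "tel"},
    "account": {"iban", "account", "bank"},
}

DIGITS = re.compile(r"\b\d{6,}\b")  # candidate values: long digit sequences
WORD = re.compile(r"[a-z]+")


def classify_numbers(text, window=3):
    """Tag each long digit sequence using the words right before it."""
    results = []
    for match in DIGITS.finditer(text):
        # Only look at the current sentence, so that hints from an
        # earlier sentence do not leak into this match.
        sentence_start = text.rfind(".", 0, match.start()) + 1
        before = WORD.findall(text[sentence_start:match.start()].lower())
        context = set(before[-window:])
        label = "unknown"
        for candidate, hints in CONTEXT_HINTS.items():
            if hints & context:
                label = candidate
                break
        results.append((match.group(), label))
    return results


sample = "Please call my mobile 015112345678. Order 20231105 shipped."
tags = classify_numbers(sample)
print(tags)
```

The digits alone do not say whether a value is sensitive; the word “mobile” next to it does. Machine learning based classifiers generalize this idea far beyond fixed keyword lists.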
How can I manage to provide trustworthy data? Monitoring and then improving the quality of the available data according to the requirements of the end user is the next important level. Data needs to be “fit for use” for exactly the purpose it is delivered for, or according to the regulatory requirements to be fulfilled. Machine learning algorithms can help here, for example by identifying outliers or hidden rules in existing data and suggesting them as data quality rules.
Why not take a look at the article by Wei Han documenting a showcase on how to improve and monitor data quality using SAP Data Intelligence?
How can I ensure data is used by the right persons? Finally, when it comes to the usage and consumption of data, two things matter: measures for managing access to the data, and transparency about how data moves across the data landscape. Together they ensure that users work only with the data they are allowed to see.
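As a rough idea of how outliers in existing data can turn into a suggested data quality rule, here is a small Python sketch. It uses a simple statistical technique (Tukey fences on the interquartile range) rather than machine learning, and it is not the method used by SAP Data Intelligence; the function name suggest_range_rule and the sample values are assumptions made for illustration.

```python
import statistics


def suggest_range_rule(values, k=1.5):
    """Suggest a min/max rule from the interquartile range (Tukey fences)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    # Values outside the fences are flagged for review, not auto-corrected.
    outliers = [v for v in values if v < low or v > high]
    return {"min": low, "max": high, "outliers": outliers}


# Hypothetical "age" column containing one obvious data entry error.
ages = [34, 29, 41, 38, 45, 31, 36, 240, 33, 39]
rule = suggest_range_rule(ages)
print(f"Suggested rule: {rule['min']} <= age <= {rule['max']}")
print("Records to review:", rule["outliers"])
```

A data steward would still review such a suggestion before activating it as a rule, since a statistical outlier is not necessarily a wrong value.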
For sharing and provisioning data products from your data platform you can put an API Management layer as part of the SAP BTP Integration Suite around your platform to manage, govern and even monetize the data access. There is a nice Mission available on the SAP Discovery Center.
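The combination of access management and usage transparency described above can be sketched in a few lines of Python. This is only a conceptual illustration under stated assumptions, not the API of SAP API Management or any other SAP BTP service; the class DataCatalog, the roles, and the classification labels are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataCatalog:
    # dataset name -> classification label (hypothetical labels)
    classifications: dict
    # role -> set of classification labels that role may read
    policies: dict
    # every access decision is recorded for later transparency
    audit_log: list = field(default_factory=list)

    def can_read(self, role, dataset):
        """Check access by classification and record the decision."""
        label = self.classifications.get(dataset)
        allowed = label is not None and label in self.policies.get(role, set())
        self.audit_log.append((role, dataset, allowed))
        return allowed


catalog = DataCatalog(
    classifications={"sales": "internal", "hr_salaries": "confidential"},
    policies={
        "analyst": {"internal"},
        "hr_manager": {"internal", "confidential"},
    },
)

print(catalog.can_read("analyst", "sales"))        # internal data, allowed
print(catalog.can_read("analyst", "hr_salaries"))  # confidential, denied
print(catalog.audit_log)
```

Tying the access decision to the classification label rather than to individual datasets is what lets the earlier cataloging and classification steps drive access control automatically.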
That sounds pretty much like the video library story from the beginning. It is not that different. Organizations want to give their users access to exactly the right, trustworthy data out of the huge amount of available data, so that they can get their job done. As in the movie libraries, cataloging, classification, access management, and usage management are key and need to be automated.
Next steps with SAP Business Technology Platform
With the SAP Business Technology Platform, SAP provides all the necessary applications, tools, and services to build and orchestrate a powerful Cloud Data Management Platform that allows organizations to integrate, connect, improve, govern, and provide their data from any source to any consumer, while always knowing who is playing with their most valuable asset: their data!
The SAP Discovery Center is a great starting point for understanding more about the individual SAP Business Technology Platform Services mentioned in the example architecture above:
And of course, beyond the “Build your own Platform” core services mentioned so far, the starting point can also be the SAP Data Warehouse Cloud offering, which also runs on the SAP Business Technology Platform. Take a look at the SAP Data Warehouse Cloud Tutorial for Developers.
Let me close for today with my favorite “calls to action” for you:
- Understand the “12 ways to create business value from data with SAP Business Technology Platform” with our recently published Yellow Pages document.
- For a free and interactive experience with SAP join us for a #GiveDataPurpose Trusted Data Workshop where we can drill down deeper into the secrets of Data Governance.