Democratizing Data with SAP Data Intelligence’s Self-Service Data Preparation
SAP Data Intelligence Metadata Explorer is a self-service platform. It provides holistic data governance and data management across the entire company. Data discovery, publishing, profiling, lineage tracking, data quality rules, and business glossaries document the data of each data source. Self-service data preparation enables data users to curate datasets with an easy-to-use, spreadsheet-like interactive interface. Monitoring dashboards track the status of publishing, profiling, lineage, rules, and data preparation tasks. With SAP Data Intelligence Metadata Explorer, data users can promote data from raw to analytics-ready datasets quickly and on demand. It democratizes data and reduces the time to actionable insight.
At the heart of SAP Data Intelligence Metadata Explorer is a Data Catalog. The Data Catalog is a central metadata repository. It stores the metadata for published datasets and provides an integrated, secure and end-to-end metadata layer across multiple systems and silos.
The Data Catalog empowers data users to navigate, search, access, enrich, and understand the context of data.
Publishing a dataset copies its metadata into the Data Catalog and shares the dataset with others.
Tags classify datasets and help users search for the datasets that carry them. Profiling a published dataset automatically tags its matching columns with tags from the built-in “ContentType” tag hierarchy. Data users can also create custom tag hierarchies and manually tag a dataset and its columns.
The business glossary provides a central and shared repository for defining terms. Business glossary terms provide definitions and context for published datasets. They promote a common, consistent understanding of business terms across the organization. After a term is defined in the business glossary, it can be assigned to a published dataset.
Term relationships are links to other terms, rules, rulebooks, published datasets, and columns. These relationships create connections that are stored and visualized as an enterprise knowledge graph, which can give a complete picture of a term’s relevance to the business. Aim to use the business glossary enterprise-wide or division-wide.
Data users can connect a dataset to the graph by editing the related objects for a term. This allows them to populate graphs automatically and find related data based on the defined business glossary.
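As a rough illustration of how term relationships form a graph that can be walked to find related data, the following Python sketch models terms, datasets, and rules as nodes and relationships as undirected edges. It is not SAP Data Intelligence code and does not use any SAP API; the GlossaryGraph class, the node names, and the hop limit are all hypothetical.

```python
from collections import defaultdict, deque


class GlossaryGraph:
    """Toy knowledge graph. Nodes are (kind, name) tuples, e.g. ("term", "Customer")."""

    def __init__(self):
        self._edges = defaultdict(set)

    def relate(self, a, b):
        # A relationship is stored in both directions so it can be followed either way.
        self._edges[a].add(b)
        self._edges[b].add(a)

    def related(self, start, kind, max_hops=2):
        """Return sorted names of nodes of `kind` within `max_hops` of `start`."""
        seen = {start}
        queue = deque([(start, 0)])
        found = set()
        while queue:
            node, hops = queue.popleft()
            if hops == max_hops:
                continue  # do not expand beyond the hop limit
            for neighbor in self._edges.get(node, ()):
                if neighbor in seen:
                    continue
                seen.add(neighbor)
                if neighbor[0] == kind:
                    found.add(neighbor[1])
                queue.append((neighbor, hops + 1))
        return sorted(found)


# Hypothetical glossary content, for illustration only.
graph = GlossaryGraph()
graph.relate(("term", "Customer"), ("dataset", "crm_accounts.csv"))
graph.relate(("term", "Customer"), ("term", "Account"))
graph.relate(("term", "Account"), ("dataset", "billing_accounts.parquet"))
graph.relate(("term", "Customer"), ("rule", "email_not_null"))

direct = graph.related(("term", "Customer"), "dataset", max_hops=1)
extended = graph.related(("term", "Customer"), "dataset", max_hops=2)
```

In this sketch, the one-hop search returns only the dataset assigned directly to the term, while the two-hop search also picks up a dataset reached through a related term.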
The Fact sheet Relationships tab shows the business glossary terms, tags, and rules applied to the dataset.
The Fact sheet Reviews tab allows data users to rate, comment on, and discuss a published dataset.
Aggregating business metadata and tribal knowledge for the published dataset in the form of tags, business glossary terms, data quality rules, ratings and comments helps data users to find datasets and their relationships more easily.
Extracting lineage information for a dataset helps data users understand its operational metadata, such as where the dataset is used and how it is transformed.
All of this information builds trust in the dataset. With a solid understanding of the data context, data users can make well-informed decisions about whether to use the dataset and how to use it in later data preparation.
Building on the power of the Data Catalog, SAP Data Intelligence Metadata Explorer provides the following key capabilities:
Data Discovery & Governance
SAP Data Intelligence Metadata Explorer provides a centralized self-service portal and marketplace for data users to search and explore enterprise data from any and all sources.
Data users can run a faceted search through an interface similar to an e-commerce storefront, narrowing the results by specifying filter criteria.
The fact sheet shows a detailed view of the metadata for each dataset found.
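The idea behind faceted search can be sketched in a few lines of Python: filter catalog entries on several attributes at once, then count the remaining values of a facet so the user can narrow further. This is only a conceptual sketch, not the Metadata Explorer implementation; the CATALOG entries, field names, and connection names are invented for illustration.

```python
from collections import Counter

# Hypothetical catalog entries; real catalog metadata will look different.
CATALOG = [
    {"name": "crm_accounts", "connection": "HANA_PROD", "type": "table",
     "tags": {"Customer", "PII"}},
    {"name": "billing_accounts", "connection": "S3_LAKE", "type": "parquet",
     "tags": {"Customer"}},
    {"name": "web_clicks", "connection": "S3_LAKE", "type": "csv",
     "tags": {"Clickstream"}},
]

_MULTI = (set, frozenset, list, tuple)


def _matches(value, wanted):
    # Multi-valued fields (such as tags) match when they contain the wanted value.
    if isinstance(value, _MULTI):
        return wanted in value
    return value == wanted


def faceted_search(catalog, text="", **facets):
    """Return entries whose name contains `text` and that satisfy every facet filter."""
    return [
        entry for entry in catalog
        if text.lower() in entry["name"].lower()
        and all(_matches(entry.get(key), wanted) for key, wanted in facets.items())
    ]


def facet_counts(results, key):
    """Count how many results fall under each value of one facet."""
    counts = Counter()
    for entry in results:
        value = entry.get(key)
        if isinstance(value, _MULTI):
            counts.update(value)
        elif value is not None:
            counts[value] += 1
    return dict(counts)


accounts = faceted_search(CATALOG, text="accounts")
by_connection = facet_counts(accounts, "connection")
lake_customers = faceted_search(CATALOG, connection="S3_LAKE", tags="Customer")
```

Here the text search finds two datasets, the facet counts show one hit per connection, and adding the connection and tag filters narrows the result to a single dataset.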
Data Quality Monitoring
One of the biggest sources of delay in analyzing data is poor data quality, including inconsistencies, misspellings, and false values.
Once data discovery has surfaced the data quality issues, data users can proactively prevent low-quality data from undermining analytics: they define business rules, bind the rules to a dataset, and apply them to validate and measure specific aspects of the dataset.
Data users can also create a data quality rules dashboard to reflect whether the dataset has passed the rules.
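The define, bind, and measure cycle can be illustrated with a small Python sketch. It is a simplified model, not the Metadata Explorer rule engine: the Rule class, the rule names, the thresholds, and the sample rows are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str
    column: str                      # binding: the dataset column the rule checks
    check: Callable[[object], bool]  # returns True when a value passes
    threshold: float = 100.0         # minimum passing percentage


def evaluate(rule, rows):
    """Return (pass_percentage, passed) for one rule over a list of row dicts."""
    if not rows:
        return 100.0, True
    ok = sum(1 for row in rows if rule.check(row.get(rule.column)))
    pct = 100.0 * ok / len(rows)
    return pct, pct >= rule.threshold


def scorecard(rules, rows):
    """Evaluate every rule; the result is the data a dashboard would display."""
    return {rule.name: evaluate(rule, rows) for rule in rules}


EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

# Hypothetical rules bound to the columns of a hypothetical dataset.
rules = [
    Rule("email_format", "email",
         lambda v: isinstance(v, str) and EMAIL.fullmatch(v) is not None,
         threshold=90.0),
    Rule("country_not_empty", "country",
         lambda v: bool(v and str(v).strip()),
         threshold=70.0),
]

rows = [
    {"email": "a@example.com", "country": "DE"},
    {"email": "b@example", "country": "US"},      # malformed email
    {"email": "c@example.org", "country": ""},    # missing country
    {"email": "d@example.net", "country": "FR"},
]

results = scorecard(rules, rows)
```

In this sample, both rules pass on 75% of the rows, so the email rule fails its 90% threshold while the country rule passes its 70% threshold.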
With the data and data quality rules in place, data users can focus on preparing the dataset for analytic workloads.
Data Preparation & Enrichment
With SAP Data Intelligence, data engineers can create data pipelines in SAP Data Intelligence Modeler to do the data preparation work.
SAP Data Intelligence Metadata Explorer also provides self-service Data Preparation to do code-free, agile data preparation. Business users and data scientists can access, transform, and enrich datasets using a spreadsheet-like user interface.
Use self-service data preparation to find data quality issues, correct and standardize data, and then output the data for analysis. This process improves efficiency and yields better business insights.
SAP Data Intelligence Metadata Explorer democratizes data by empowering business users and data scientists to discover, access, and prepare data in a self-service manner. It accelerates the delivery of trustworthy, actionable data and reduces the time to insight.