Andy Yin

Democratizing Data with SAP Data Intelligence’s Self-Service Data Preparation

SAP Data Intelligence Metadata Explorer is a self-service platform that provides holistic data governance and data management across the entire company. Data discovery, publishing, profiling, lineage tracking, data quality rules, and business glossaries document the data of each data source. Self-service data preparation lets data users curate datasets through an easy-to-use, spreadsheet-like interactive interface. Monitoring dashboards track the status of publishing, profiling, lineage, rules, and data preparation tasks. With SAP Data Intelligence Metadata Explorer, data users can promote raw data to analytics-ready datasets quickly and on demand. It democratizes data and reduces the time to actionable insight.

SAP Data Intelligence Metadata Explorer

Data Catalog

At the heart of SAP Data Intelligence Metadata Explorer is a Data Catalog. The Data Catalog is a central metadata repository. It stores the metadata for published datasets and provides an integrated, secure and end-to-end metadata layer across multiple systems and silos.

Data Catalog

The Data Catalog empowers data users to navigate, search, access, enrich, and understand the context of data.

Catalog View

Publishing a dataset stores a local copy of its metadata in the Data Catalog and shares the data with others.

Publishing a dataset

Profiling analyzes the dataset and gathers technical metadata that helps data users understand the data and its quality.

Profiling a dataset

Profiling gathers all column statistics and shows the result in the fact sheet
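To make the idea of column statistics concrete, here is a minimal sketch in plain Python. It is not the Metadata Explorer's implementation or API; the function names, the chosen statistics, and the sample rows are all assumptions made for illustration.

```python
from collections import Counter


def profile_column(values):
    """Return basic statistics for one column given as a list of cell values."""
    # Treat None and empty strings as missing values.
    non_null = [v for v in values if v is not None and v != ""]
    counts = Counter(non_null)
    return {
        "count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(counts),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
        "most_common": counts.most_common(1)[0][0] if counts else None,
    }


def profile_dataset(rows):
    """Profile every column of a dataset given as a list of dicts."""
    columns = list(rows[0].keys()) if rows else []
    return {col: profile_column([row.get(col) for row in rows]) for col in columns}


# Hypothetical sample data, standing in for a published dataset.
rows = [
    {"country": "DE", "revenue": 120},
    {"country": "US", "revenue": 300},
    {"country": "DE", "revenue": None},
]
profile = profile_dataset(rows)
```

A fact sheet would present statistics like these per column, so a reader can spot missing values or unexpected ranges before using the dataset.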

Tags classify datasets and make it possible to search for datasets that carry a given tag. Profiling a published dataset automatically tags its matching columns with tags from the built-in “ContentType” tag hierarchy. Data users can also create custom tag hierarchies and manually tag a dataset and its columns.

Auto tagging
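One way to picture automatic tagging is pattern matching on column values. The sketch below is an illustration only: the tag names, the regular expressions, and the 80% threshold are assumptions, and the article does not describe how the built-in “ContentType” tags are actually detected.

```python
import re

# Hypothetical content-type patterns; the real built-in tags are not listed here.
CONTENT_TYPE_PATTERNS = {
    "Email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "Date": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
}


def suggest_tags(values, threshold=0.8):
    """Suggest tags for a column when enough non-empty values match a pattern."""
    non_empty = [str(v) for v in values if v not in (None, "")]
    if not non_empty:
        return []
    tags = []
    for tag, pattern in CONTENT_TYPE_PATTERNS.items():
        matches = sum(1 for v in non_empty if pattern.match(v))
        # Tag the column only if the share of matching values reaches the threshold.
        if matches / len(non_empty) >= threshold:
            tags.append(tag)
    return tags
```

For example, `suggest_tags(["a@b.com", "c@d.org"])` suggests the hypothetical `Email` tag, while a column with only two dates out of three values stays untagged at the default threshold.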

The business glossary provides a central, shared repository for defining terms. Business glossary terms provide definitions and context for published datasets. They promote a common, consistent understanding of business terms across the organization. After a term is defined in the business glossary, it can be assigned to a published dataset.

Business glossary term

Term relationships are links to other terms, rules, rulebooks, published datasets, and columns. They create connections that are stored and visualized as an enterprise knowledge graph, which can give a complete picture of a term’s relevance to the business. For that reason, aim to use the business glossary enterprise-wide or division-wide.

Relationship Graph

Data users can connect a dataset to the graph by editing the related objects of a term. This lets them populate graphs automatically and find related data based on the defined business glossary.

Edit related objects for a term
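The relationship graph can be thought of as an undirected graph whose nodes are terms, datasets, rules, and columns. The following sketch shows how related objects could be found by walking such a graph; the class, its methods, and the sample objects are hypothetical and do not reflect how Metadata Explorer stores relationships.

```python
from collections import defaultdict, deque


class RelationshipGraph:
    """A tiny undirected graph of (kind, name) nodes such as terms and datasets."""

    def __init__(self):
        self._edges = defaultdict(set)

    def relate(self, a, b):
        """Record a relationship between two objects in both directions."""
        self._edges[a].add(b)
        self._edges[b].add(a)

    def related(self, start, max_hops=2):
        """Return objects reachable from start within max_hops, nearest first."""
        seen = {start}
        queue = deque([(start, 0)])
        found = []
        while queue:
            node, hops = queue.popleft()
            if hops == max_hops:
                continue
            # Sort neighbors so the traversal order is deterministic.
            for neighbor in sorted(self._edges[node]):
                if neighbor not in seen:
                    seen.add(neighbor)
                    found.append(neighbor)
                    queue.append((neighbor, hops + 1))
        return found


# Hypothetical glossary terms, datasets, and a rule.
graph = RelationshipGraph()
graph.relate(("term", "Customer"), ("dataset", "customers"))
graph.relate(("term", "Customer"), ("term", "Account"))
graph.relate(("term", "Account"), ("dataset", "accounts"))
graph.relate(("term", "Customer"), ("rule", "customer_id_not_null"))

direct = graph.related(("term", "Customer"), max_hops=1)
nearby = graph.related(("term", "Customer"), max_hops=2)
```

With one hop, the term reaches only its directly related objects; with two hops, it also reaches the dataset attached to a related term, which is how a glossary can surface data a user did not search for directly.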

The Relationships tab of the fact sheet shows the business glossary terms, tags, and rules applied to the dataset.

Relationships consist of terms, tags, and rules applied to the dataset

The Reviews tab of the fact sheet lets data users rate, comment on, and discuss a published dataset.

Review and comment on a dataset

Aggregating business metadata and tribal knowledge for the published dataset in the form of tags, business glossary terms, data quality rules, ratings and comments helps data users to find datasets and their relationships more easily.

Extracting lineage information for a dataset helps data users understand operational metadata, such as where the dataset is used and how it is transformed.

Data lineage

All this information makes the dataset more trustworthy. With a solid understanding of the data context, data users can make a well-informed decision on whether to use the dataset and how to use it in later data preparation.

Building on the power of the Data Catalog, SAP Data Intelligence Metadata Explorer provides the following key capabilities:

Data Discovery & Governance

SAP Data Intelligence Metadata Explorer provides a centralized self-service portal and marketplace for data users to search and explore enterprise data from any and all sources.

Data users get an interface similar to an e-commerce storefront, where they can run a faceted search by specifying various filter criteria.

Catalog search

Search result
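Faceted search combines a keyword with filters on metadata attributes. The sketch below illustrates the idea over an in-memory list; the catalog entries, the facet names (`connection`, `tags`, `rating`), and the functions are assumptions for illustration, not the product's search API.

```python
from collections import Counter

# Hypothetical catalog entries with a few metadata facets.
catalog = [
    {"name": "sales_2020.csv", "connection": "S3", "tags": {"Sales", "Finance"}, "rating": 4},
    {"name": "customers", "connection": "HANA", "tags": {"Customer"}, "rating": 5},
    {"name": "orders.parquet", "connection": "S3", "tags": {"Sales"}, "rating": 3},
]


def faceted_search(entries, keyword="", connection=None, tag=None, min_rating=0):
    """Return entries that match the keyword and every facet filter that is set."""
    results = []
    for entry in entries:
        if keyword.lower() not in entry["name"].lower():
            continue
        if connection is not None and entry["connection"] != connection:
            continue
        if tag is not None and tag not in entry["tags"]:
            continue
        if entry["rating"] < min_rating:
            continue
        results.append(entry)
    return results


def facet_counts(entries, field):
    """Count how many entries fall under each value of a single-valued facet."""
    return Counter(entry[field] for entry in entries)


sales_on_s3 = faceted_search(catalog, connection="S3", tag="Sales")
```

The facet counts are what a storefront-style sidebar would display next to each filter value, so users can see how many datasets remain before they narrow the search further.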

The fact sheet shows a detailed view of the metadata for the dataset that was found.

Fact sheet shows detailed metadata information

Data Quality Monitoring

One of the biggest sources of delay in analyzing data is quality issues, including inconsistencies, misspellings, and false values.

Data discovery reveals the data quality issues in a dataset. To proactively prevent low-quality data from undermining analytics, data users then define business rules, bind the rules to the dataset, and apply them to validate and measure specific aspects of the dataset.

Business rules

Data users can also create a data quality rules dashboard to reflect whether the dataset has passed the rules.

Data quality rules dashboard
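The define, bind, and apply cycle for rules can be sketched in a few lines. This is an illustration under stated assumptions: the rule names, the predicate style, and the 90% pass threshold are hypothetical, and the product defines rules through its own interface rather than Python.

```python
# Hypothetical rules: each maps a rule name to a predicate over one row.
RULES = {
    "country_not_empty": lambda row: bool(row.get("country")),
    "revenue_non_negative": lambda row: row.get("revenue") is not None and row["revenue"] >= 0,
}


def pass_rate(rows, predicate):
    """Return the percentage of rows that satisfy a rule."""
    if not rows:
        return 100.0
    passed = sum(1 for row in rows if predicate(row))
    return 100.0 * passed / len(rows)


def scorecard(rows, rules, threshold=90.0):
    """Apply every rule to the dataset and report a pass rate and status per rule."""
    card = {}
    for name, predicate in rules.items():
        rate = pass_rate(rows, predicate)
        card[name] = {
            "pass_rate": rate,
            "status": "passed" if rate >= threshold else "failed",
        }
    return card


# Hypothetical dataset with one negative revenue value.
rows = [
    {"country": "DE", "revenue": 120},
    {"country": "US", "revenue": 300},
    {"country": "FR", "revenue": -5},
    {"country": "IT", "revenue": 40},
]
card = scorecard(rows, RULES)
```

A dashboard is then a presentation of such a scorecard: one entry per rule, with the pass rate and whether it met the threshold.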

With the data and data quality rules in place, data users can focus on preparing the dataset for analytic workloads.

Data Preparation & Enrichment

With SAP Data Intelligence, data engineers can create data pipelines in SAP Data Intelligence Modeler to do the data preparation work.

SAP Data Intelligence Metadata Explorer also provides self-service Data Preparation to do code-free, agile data preparation. Business users and data scientists can access, transform, and enrich datasets using a spreadsheet-like user interface.

Data preparation using a spreadsheet-like user interface.

Use self-service data preparation to find data quality issues, correct and standardize data, and then output the data for analysis. This process improves efficiency and leads to better business insights.
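Self-service data preparation is code-free, but the steps it performs (correct, standardize, output) map onto familiar operations. The sketch below shows an equivalent in plain Python purely to illustrate what such a preparation does; the column names, the alias table, and the functions are assumptions.

```python
import csv
import io

# Hypothetical mapping from free-text country values to standard codes.
COUNTRY_ALIASES = {"germany": "DE", "de": "DE", "usa": "US", "us": "US"}


def standardize_row(row):
    """Trim whitespace, normalize name casing, and map countries to codes."""
    country = row["country"].strip().lower()
    return {
        "customer": row["customer"].strip().title(),
        "country": COUNTRY_ALIASES.get(country, country.upper()),
    }


def prepare(rows):
    """Standardize every row and drop duplicates that appear after cleaning."""
    seen = set()
    prepared = []
    for row in rows:
        clean = standardize_row(row)
        key = (clean["customer"], clean["country"])
        if key not in seen:
            seen.add(key)
            prepared.append(clean)
    return prepared


def to_csv(rows):
    """Write the prepared rows to CSV text, ready for an analytics tool."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["customer", "country"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical raw input with inconsistent spacing, casing, and country values.
raw = [
    {"customer": " alice smith ", "country": "Germany"},
    {"customer": "Alice Smith", "country": "de"},
    {"customer": "bob jones", "country": "usa "},
]
prepared = prepare(raw)
output = to_csv(prepared)
```

The first two raw rows collapse into one record once they are standardized, which is the kind of duplicate that is hard to see in the raw data and easy to see in a spreadsheet-like view after cleaning.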

Summary

SAP Data Intelligence Metadata Explorer democratizes data by empowering business users and data scientists to discover, access, and prepare data in a self-service manner. It accelerates the delivery of trustworthy, actionable data and reduces the time to insight.
