Data Quality Inspection by Data Quality Service
Data Quality Inspection (Image Source: SAP)
This article is intended to provide an overview of the Data Quality Service (Service ID: 50109586) scope option Data Quality Inspection and how it can be used to improve data quality.
Data Quality Service portfolio (Image Source: Own Image)
The goal of the blog post is to introduce the Data Quality Inspection offered by SAP’s Data Management and Landscape Transformation (DMLT) team as part of the Data Quality Service.
The post will briefly introduce the motivation for this service. It then presents the scope and delivery approach with a typical project schedule and deployment options.
Motivation and Value
If we look at companies in today’s world, one of their main challenges is data quality. Many customers have great analytics solutions, but if the underlying data is poor, the entire reporting is questionable at best. They spend a lot of time and effort getting the data that flows into the analytics world from all the operational systems into proper shape. While many of our customers know that something is wrong with their data, such as incorrect, incomplete and/or duplicate records, very often they simply do not address these issues, for a variety of reasons. One reason is that they often don’t know where the root cause lies.
Another value driver for data quality is the transition to SAP S/4HANA, which is of course a good thing, but the challenge is the quality of the data you bring along. That is what the huge truck here represents. Do you want to switch to SAP S/4HANA and put your data, potentially of poor quality, into your beautiful SAP S/4HANA world? Transitioning to SAP S/4HANA is a great opportunity to clean up the data, but before you can clean it up, you need to identify the bad data and understand how to clean it.
Transition to SAP S/4HANA (Image Source: SAP)
Data Quality Inspection provides value for example in the following use cases:
- Identify existing duplicates for customers and/or vendors prior to the CVI pre-project in a system conversion. Existing duplicates can be technically consolidated into a Golden Record by Data Quality Improvement prior to CVI, which prevents “doubling” the data quality issue into Business Partner.
- During post-merger integration, process harmonization is an important task. To harmonize processes, harmonization of customizing and master data is required. By using Data Quality Inspection, you can identify differences in the configuration of processes and master data between the two solution architectures which are subject to post-merger integration.
- As of S/4HANA Release 2021, the customer business object as well as the Order-2-Cash process enable us to work with multiple, time-dependent addresses. With this new capability, duplicates which were created intentionally in the past become obsolete. The standard implementation of “multi-address handling” is based on defining a golden record and phasing out the duplicates. This phase-out may be inconvenient for some industries or customers. With the help of Data Quality Inspection and subsequent Data Quality Improvement, the multi-address capability can be implemented as part of a service in a “big bang” without any phase-out activities.
The scope options “Data Quality Improvement for S/4HANA Conversion” and “Multi-Address Handling” will be addressed in detail in an upcoming blog post.
Data Quality Inspection Scope
The Data Quality Inspection (DQ Inspection) is a system-based data analysis that uses pre-defined rules (provided by SAP) to detect data quality issues such as (1) duplicates and/or (2) data inconsistencies and incomplete or incorrect data. The process flow followed by the Data Quality Inspection service was already described in the previous blog post.
Data Quality Inspection Process Flow (Image Source: Own Image)
By identifying duplicates, customers can significantly improve the state of their data so that, for example, intelligent analysis delivers better results. When identifying duplicates, customers can flexibly choose which fields to use to identify possible duplicates.
Duplicate Identification Attributes (Image Source: Own Image)
After the scan, possible duplicates are displayed with a degree of similarity as a percentage.
Possible duplicates presented in the tool (Image Source: Own Image)
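To make the idea of a similarity percentage more tangible, here is a minimal Python sketch. It is not how the SAP tool works internally; it simply assumes a few hypothetical customer records and field names, compares the fields chosen by the user with the standard library's `difflib.SequenceMatcher`, and reports pairs above an assumed threshold.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical customer records; the field names are illustrative only.
customers = [
    {"id": "1001", "name": "Acme Industries GmbH", "city": "Walldorf", "postal_code": "69190"},
    {"id": "1002", "name": "ACME Industries Gmbh", "city": "Walldorf", "postal_code": "69190"},
    {"id": "1003", "name": "Bolt Logistics AG", "city": "Hamburg", "postal_code": "20095"},
]


def similarity(a, b, fields):
    """Average character-level similarity of the chosen fields, as a percentage."""
    ratios = [
        SequenceMatcher(None, a[f].strip().lower(), b[f].strip().lower()).ratio()
        for f in fields
    ]
    return round(100 * sum(ratios) / len(ratios), 1)


def possible_duplicates(records, fields, threshold=90.0):
    """Return (id_a, id_b, score) for every pair scoring at or above the threshold."""
    pairs = []
    for a, b in combinations(records, 2):
        score = similarity(a, b, fields)
        if score >= threshold:
            pairs.append((a["id"], b["id"], score))
    # Highest similarity first, as a reviewer would want to see it.
    return sorted(pairs, key=lambda p: -p[2])


# The caller decides which fields take part in the comparison.
matches = possible_duplicates(customers, fields=["name", "city", "postal_code"])
for id_a, id_b, score in matches:
    print(f"{id_a} ~ {id_b}: {score}%")
```

Comparing every pair of records is fine for a small illustration, but it grows quadratically; the threshold of 90% and the equal weighting of all fields are assumptions chosen for this example only.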
Data Quality Dashboard (Image Source: Own Image)
The Data Quality Inspection scope option provides rich out-of-the-box content covering 36 SAP business objects, such as profit center, internal order, or bill of material. Additional content is available on request (Industries, HR, etc.).
Standard content coverage
In addition, the out-of-the-box content contains more than 2000 standard rules to inspect these 36 objects. Customers can request that existing rules be adapted or new rules be added.
This standard content (business objects and rules) leads to a short service ramp-up time.
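As a rough illustration of what a rule-based inspection means, the following Python sketch applies two made-up rules to a handful of hypothetical vendor records. The rule names, fields, and data are assumptions for this example and are not taken from SAP's delivered rule content.

```python
import re

# Hypothetical rules for illustration; these are not SAP's delivered rules.
# Each rule returns True when the record passes the check.
RULES = {
    "postal_code_filled": lambda r: bool(r.get("postal_code", "").strip()),
    "country_is_two_letters": lambda r: re.fullmatch(r"[A-Z]{2}", r.get("country", "")) is not None,
}


def inspect(records, rules):
    """Return, per rule, the ids of all records that violate it."""
    findings = {name: [] for name in rules}
    for record in records:
        for name, check in rules.items():
            if not check(record):
                findings[name].append(record["id"])
    return findings


# Hypothetical vendor records.
vendors = [
    {"id": "V1", "postal_code": "69190", "country": "DE"},
    {"id": "V2", "postal_code": "", "country": "DE"},
    {"id": "V3", "postal_code": "10115", "country": "Germany"},
]

findings = inspect(vendors, RULES)
for name, failed in findings.items():
    print(f"{name}: {len(failed)} of {len(vendors)} records failed -> {failed}")
```

Keeping each rule as a small, named check makes it easy to adapt an existing rule or add a customer-specific one without touching the inspection loop, and the per-rule list of record ids is what allows a result to be traced back to individual records.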
Typical Project Schedule
The service follows an agile approach that adapts flexibly to customer needs. The actual project duration depends on the share of standard content (already existing rules and objects vs. customer-specific rules and objects) and on how quickly the customer can complete their activities. Typically, the Data Quality Inspection takes 4-8 weeks.
Data Quality Inspection project flow (Image Source: Own Image)
The service always starts with a kick-off session which is often combined with the scoping session. During this session, we jointly define the scope and determine how the requested target can be reached. The initial scope is defined in cooperation with SAP consultants and the customer’s data management team and related business line representative(s). The customer can pick and choose from the standard content objects and can add specific rules or other requirements if needed.
The Data Quality Inspection Service can be performed for both SAP and non-SAP systems. For SAP ABAP / NetWeaver-based systems (SAP ECC, SAP S/4HANA, SAP CRM, etc.), the customer is provided with an extractor via a note. The customer runs the extractor on one or more ABAP/NetWeaver-based source systems and provides the data to the SAP team. For non-ABAP / NetWeaver-based systems, the customer is responsible for extracting and providing the data in a common format (.CSV) containing all attributes required for data inspection.
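For non-ABAP sources, a simple pre-check before handing over a file could look like the Python sketch below. The list of required attributes is purely an assumption for illustration; the actual attributes are agreed during scoping.

```python
import csv
import io

# Assumed attribute list for illustration; the real list is defined during scoping.
REQUIRED_ATTRIBUTES = ["id", "name", "city", "postal_code"]


def missing_attributes(csv_text, required):
    """Return the required column names that are absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    return [column for column in required if column not in header]


# Hypothetical extract with one required column left out.
sample = "id,name,city\n1001,Acme Industries GmbH,Walldorf\n"

missing = missing_attributes(sample, REQUIRED_ATTRIBUTES)
if missing:
    print(f"Extract is missing required attributes: {missing}")
else:
    print("Extract contains all required attributes.")
```

Running such a check on the customer side catches incomplete extracts before the data is handed over, which avoids a round trip later in the project.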
The extracted data is uploaded to a customer-specific SAP Business Technology Platform (SAP BTP) instance. Every customer gets their own instance. In this environment, SAP executes the rules and checks for outliers that may not apply to the customer’s data.
As an output, the customer gets access to a dashboard. Within the dashboard, the customer can drill down all the way to the record level. The customer has access to this dashboard for 2-3 months after the service delivery is completed, so they have enough time to review the results. It is also possible to export the results.
The Data Quality Inspection scope typically includes one inspection run. Nevertheless, it is also possible to perform the same analysis again, for example after improvement measures. This would of course be quicker than the first analysis, as the whole project is already set up.
Which tools are being used?
The Data Quality Inspection is a pure “as-a-service” offering. The service utilizes an expert tool developed by SAP for this specific purpose, the DQ Pod. This tool is deployed via a combination of SAP Business Technology Platform (SAP BTP) cloud and on-premises components. On-premises components are delivered via transport requests and SAP Notes. All components are included in the SAP service fee. Customers do not require separate product licenses or infrastructure investments.
DQ Pod Architecture (Image Source: Own Image)
There are several deployment options, depending on the customer’s needs. The Data Quality Inspection can be delivered 100% remotely or on-site if required. Typically, SAP hosts a private cloud instance (CAL) for the duration of the project, which can be accessed by the customer. For customers with strict security guidelines, the cloud instance can also be hosted by the customer, and the SAP consultants will access the instance remotely or offline on-site.
Deployment options (Image Source: Own Image)
Thank you for reading! I hope you find this post helpful.
The Data Quality Inspection is constantly evolving, so solutions may be developed that are not explicitly mentioned in the above description but still fit the context of this service.
If you have any questions or feedback, please leave a comment below this post. If you need more information about this blog post or the service, please send an email to: firstname.lastname@example.org
Find more information and customer success stories on the website of Data Management and Landscape Transformation Services.
Use this page for more information about using SAP Data Management and Landscape Transformation (DMLT) services including FAQs.