Custom Data Quality Reporting with the Failed Data Repository
The Failed Data Repository
If you are looking to build some custom reports on the results of your data quality assessment – beyond what is available with the Information Steward’s Data Quality Dashboards – you can leverage the Failed Data Repository as the database to meet your custom reporting needs.
The Failed Data Repository provides you information about failed data from your validation rules within a supported relational database system. Information includes:
- Information about the project, connection and tables which generated the failed data (IS_FD_TABLES_MAP table)
- Execution history of all the tasks which generated the failed data (IS_FD_RUN_HISTORY table)
- All failed rules for a given run (IS_FD_RULE_INFO table)
- All the rows that failed one or more rules (<table_alias>_FD table)
- All the rules which failed a given row (<table_alias>_FR table)
For detailed information about the above referenced tables, see the section on “Accessing additional failed data information” in the Information Steward User Guide. The diagram below shows the relationships between the failed data tables.
As an example, the total rows that were validated during the run is available in the IS_FD_RUN_HISTORY table (TOTAL_ROWS column) for each IS task. And, you can join the *_FD tables to get at the failed data counts per rule/task.
Setting Up the Failed Data Repository
To leverage a Failed Data Repository for custom reporting, you must first establish a connection to the database within the Central Management Console (CMC). Ensure that the connection type is “For data that failed rules.” The image below shows an example of the connection parameters, many of which will change depending on the Database Type selected. For the most current listing of supported databases, please check out the SAP Information Steward PAM.
Specifying the Failed Data Repository
When executing a rule or set of rules (per task), you can select to save the failed data to one of the Failed Data Repositories that you have previously configured.
Viewing Failed Data from Information Steward
This is the Data Quality Scorecard Detailed View, from here you can view the failed data:
This gives you the Failed Data screen. Once you have set up the Failed Data Repository, this will give you access to “View More Failed Data” to get beyond the 500 record sample data size.
The Information Steward Repository Views
Although the Information Steward Repository is not a supported means to exact data from custom reporting, here are a few Information Steward Repository Views that may contain some additional information to meet your needs:
- MMB_Key_Data_Domain
- MMB_Key_Data_Domain_Score
- MMB_Key_Data_Domain_Score_Type
- Key Data Domain, Quality Dimension, or Rule level
- MMB_Domain_Value
- Quality Dimension descriptions
- MMB_Rule
- Rule definition/description
- MMB_Data_Group
- Project Names
Related Content
How to Create Detailed Failed Data Reports as Part of your Data Quality Analysis
- How to use SAP BusinessObjects Information Steward, along with the SAP Business Intelligence platform components SAP BusinessObjects Information Design Tool and SAP BusinessObjects BI launch pad, to produce a Web Intelligence report that will help you analyze the quality of your data.
SAP Information Steward Failed Data ReportingInformation Steward Failed Data Reporting
SAP Information Steward: DQ Reporting of the SAP HANA Live Layer with Lumira Demo
Glad you covered this topic, Corrie!
Very informative. Thanks for sharing.
Hi Corrie. This may seem a silly question, but how do you access the MMB views you highlight as they do not live in our failed data repositories
Many thanks
These are a part of the main Information Steward repository, not the failed data repository.
Creating custom reports from the FDR is ok for getting at the details of failed records.
However, a real pain in that you cannot recreate the Data Insight scores themselves.
It's ok if none of your rules have filters on them.
Yet for filtered rules, Data Insight uses the filtered rowcount for score calculation, and that value is not stored in the FDR.
Hi Corrie,
Thanks for sharing this information.
It seems that de failed data table (<alias>FD_table) only contains the rows of the last run. Correct? For auditing purposes I would like to keep the failed rows of previous runs in the database as well. Is this possible?
thanks
Gijsbert
There is an option in CMC
Hi Guys,
Can anyone help me with IS_REPO backend tables name which hold the failed record of the rule?
Thanks