Data Quality Reporting on SAP Information Steward
The Backend database can be configured while installing Information Steward or later also but the model used by SAP Information steward will remain same and independent of the what database type is been used. The idea is to write queries in such a way to explore every aspect of failed data and do efficient and customized reporting on it. For this paper I am assuming the backend database used is Microsoft SQL Server and all the queries will be written in T-SQL. Which can be customized according to the database used as a backend of SAP Information Steward.
Data Model SAP
Information Steward creates an alias of each table it has imported in the data model [IS_FD_TABLES_MAP] table. This mapping table contains the mapping of external table FQ_TABLE_NAME with the internal table name alias TABLE_ALIAS in Information Steward. The table also contains the external connection name and Information Steward Project Name.
Whenever a task is created in the SAP Information Steward also stores the information about the task, every time a task is executed successfully and an incremental RUN_ID is created. All this information is stored in [IS_FD_RUN_HISTORY].
The rules which is created in SAP Information Steward are also stored in the backend with limited information in [IS_FAILED_RULES_INFO].
Every execution of the task in Information Steward will not create table for failed data, only if you select the backend database to store the data in the SAP Information steward, full scale data will be inserted in the Information Steward data model. This data is stored in two types of tables for each source. One is the FR – Failed Rules tables and other is FD – Failed data tables. Information steward creates a surrogate key in FD type tables called IS_GEN_ROWID, which is a unique identifier for failed records in the Information steward data model. The mapping of this failed records with the Rules is maintained in the FR type table. Name of both the FR and FD type table in the database is determined based on the Table Alias name created in Information Steward.
A detailed data model used in SAP Information Steward backend is shown in below
This is the base data model which is used to store all the metadata of the failed rule’s table. The metadata of all the failed rules table is shown below.
SAP Information Steward also provide a very detailed analysis of the failed data by giving an option to create the scorecard in the tool itself. But in case the business users have some specific requirements to get the failed data analysis then this data model will enable users to create customized reporting on the tool, all the features shown in SAP Information Steward can be recreated and shown by using this data model plus a developer can customize the reports or the dashboard according to the business requirements.
Any reporting tool which can connect to the database of Information Steward can be used to create a customized scorecard. A sample query to create a summary of failed rules failing on one source table can be written as
Select a2.*, a3.TOTAL_ROWS, a1.failCount, a3.TOTAL_ROWS - a1.failCount as passCount From( Select Run_ID,RULE_ID,count(distinct is_gen_rowid) as failCount -- Replace this <Table_Alias_Name> with the TABLE_ALIAS column value From <Table_Alias_Name>_FR Group by Run_ID,RULE_ID ) as a1 Inner Join IS_FD_RULES_INFO as a2 On (a1.RULE_ID= a2.RULE_ID and a1.RUN_ID=a2.RUN_ID) Inner Join IS_FD_RUN_HISTORY as a3 On (a3.RUN_ID = a2.RUN_ID);
In the above query the <Table_Alias_Name> should be replaced by the actual TABLE_ALIAS column value found in the Mapping table [IS_FD_TABLES_MAP].
In case requirement is in such a way that some additional information is needed by the business users to be shown in the scorecard. Additional tables can be created to store that information in the database (can be same or different). These tables can be useful to prepare heavy reports by aggregating the tables or even show a trend of the failed data by maintaining the history of the data in a separate table.
To create a customized reporting on the SAP Information Steward can be done using the underlying data model which can support adding additional custom features in the Scorecard for the failed data.