The general vision for any organisation is to establish a “trusted information” environment: high-quality, accurate data, reconciled wherever there is a risk or requirement. Trusted and relevant information means one version of the truth that the business can rely on.
Data quality and reconciliation then become key concerns, especially when integrating data from different source systems into a central Enterprise Data Warehouse. It is thus critical to ensure that the source data is of high quality and that the extraction, transformation and loading (ETL) process is robust before data is loaded into the target system. Back in reality, such a perfect scenario is rare, and there is no guarantee that the ETL process always delivers accurate data.
Here are some pain points of what you may have experienced:
- During initial extraction, transformation and loading (ETL) development, the data loading processes are thoroughly tested and the data is reconciled, but this is a once-off exercise and no processes are in place to continuously guarantee data quality
- Responsibility for data quality is left with the business, which only detects problems when large variances appear in reporting results
- Business loses trust in data quality
- IT has no visibility into data quality and limited understanding of the data subject matter
Experiencing déjà vu 😯 ? To address these pain points we need to break them into two categories, ie data quality and data reconciliation.
From a data quality strategy point of view, the general recommendation is to have well-defined data quality processes and to establish toolsets that deliver the following outcomes:
- Data quality dashboards that help data owners and IT support in assessing measurement for data quality and applying simple data cleansing and validation rules
- Standardise, parse, cleanse, enhance, match, and geocode data (depending on the business requirement) to help improve and safeguard data quality
- Inconsistencies in data between systems are quickly identified and flagged
- Data reconciliation process formalisation – confidence that reconciliations are performed as scheduled or ad-hoc
- Transparency and visibility of data reconciliation processes and the results thereof
- Enhanced overall business user confidence – decisions can be made with full confidence in the underlying system data
Data Quality – Key Approach
It is recommended to set up an Organisational Change Management (OCM) initiative to identify data stewards and integrate them into the data governance framework of your enterprise. This empowers data stewards to take action to improve data quality in their respective data domains.
Some of the key responsibilities of data stewards should be to:
- Drive data quality activities by defining data quality metrics and validation rules, and by performing data validation and monitoring using data profiling tools and processes
- Enforce data governance policies and standards to ensure appropriate data entry, usage and transformation
- Provide the business with appropriate communication of issues with the delivery or quality of source data
- Coordinate and collaborate with other data stewards on enterprise data governance needs
Data Quality – Key Challenges
- Establishing a culture where data quality is a priority
- Implementing policies and data quality standards across the enterprise
- Mentoring staff on best practices for data handling and data quality
- Making decisions and resolving problems and disputes within agreed SLAs
- Establishing a toolset that provides insight into the data for reviewing and analysing data quality scorecards
Data Quality – Toolset
A data quality profiling and validation toolset (such as SAP Information Steward) for data stewards, business users and IT support teams is a way to measure and improve data quality across the enterprise. It helps users gain insight into their data through dashboards, analyse business rules and improve the overall quality of the data.
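To illustrate the kind of validation rules a data steward might define, here is a minimal sketch in plain Python (field names, the rule set and the sample records are all hypothetical, not taken from any particular tool):

```python
import re

# Hypothetical validation rules for a customer data domain;
# field names and allowed values are illustrative only.
RULES = {
    "customer_id": lambda v: v is not None and str(v).strip() != "",
    "email": lambda v: v is not None
        and re.match(r"^[^@\s]+@[^@\s]+\.[^@\s]+$", v) is not None,
    "country": lambda v: v in {"AU", "NZ", "ZA", "GB"},
}

def profile(records):
    """Return the pass rate per rule, as a data quality scorecard would."""
    scorecard = {}
    for field, rule in RULES.items():
        passed = sum(1 for r in records if rule(r.get(field)))
        scorecard[field] = passed / len(records)
    return scorecard

records = [
    {"customer_id": "C001", "email": "a@example.com", "country": "AU"},
    {"customer_id": "",     "email": "bad-email",     "country": "AU"},
]
print(profile(records))  # {'customer_id': 0.5, 'email': 0.5, 'country': 1.0}
```

A real toolset adds scheduling, dashboards and cleansing on top, but the underlying scorecard is essentially this pass-rate calculation per rule.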
(Figure shows a Data Quality Scorecard – SAP reference image)
The approach to establishing trusted data is to reconcile key metrics, ensuring the consistency of data moved between the source and target (BI) systems. Reconciling a data source metric allows the integrity of the data loaded into BI to be checked, for example by comparing the totals of a metric in BI against the corresponding totals in the source system.
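A minimal sketch of that totals check, assuming the aggregated totals have already been extracted from each system (the metric values and tolerance are illustrative):

```python
def reconcile_totals(source_total, target_total, tolerance=0.0):
    """Compare a metric's total in the source system against the BI total."""
    variance = target_total - source_total
    return {
        "source": source_total,
        "target": target_total,
        "variance": variance,
        "matched": abs(variance) <= tolerance,
    }

# Illustrative figures: a revenue total in the source system vs in BI.
result = reconcile_totals(source_total=1_250_000.00, target_total=1_249_400.00)
print(result["matched"], result["variance"])  # False -600.0
```

In practice the tolerance would be agreed per metric (exact match for counts, a small rounding allowance for currency conversions, and so on).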
In general, it is recommended that a well-defined reconciliation process and toolset be established that deliver the following outcomes:
- Inconsistencies in data between systems are quickly identified and flagged
- Reconciliation process formalised and automated – confidence that reconciliations are performed consistently and frequently
- Transparency and visibility of reconciliation processes and the results thereof
- Enhanced overall user confidence – decisions can be made with full confidence in the underlying data
Candidates for data reconciliation would be based on the factors below:
- Risk of data quality:
  - Complexity of extraction processes (eg extraction via delta queue vs full load)
  - Regular human intervention required
  - Number of modifications to standard extraction processes
  - Loading process (delta loads vs full loads)
  - Frequency of change to business rules and validations
- Ability to execute reconciliation:
  - Level of customisation and complexity of data transformation
  - Data volumes
- Business impact of inaccurate data
Each of the above factors would be scored as low, medium or high.
Assessment questions should be provided to evaluate the risk factors for each metric during the requirements-gathering workshops. If the assessment result for a metric is above the threshold level (medium) in all three areas, the metric is recommended as in scope for reconciliation. The scope may be extended by the IT support team if they experience reconciliation issues in problem areas.
Data Reconciliation – Key Approach
Signal data reconciliation issues
- When a data reconciliation issue is identified, a notification should be sent to the IT support team to take the necessary corrective actions, and a nominated functional owner (eg data owner, business owner) should be informed of the inconsistencies in the reporting data
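As a sketch, the signalling step could look like the following, with a pluggable `notify` hook standing in for a real channel (email, ticketing system, chat webhook); the metric name, variance and owner are hypothetical:

```python
def signal_issue(metric, variance, functional_owner, notify):
    """Route a detected reconciliation issue to IT support (to act)
    and to a functional owner (to be informed of inconsistencies)."""
    notify("it-support", f"Reconciliation variance on {metric}: {variance}")
    notify(functional_owner, f"Reporting data for {metric} may be inconsistent")

# For illustration, collect the messages instead of sending them.
sent = []
signal_issue("net_revenue", -600.0, "finance-data-owner",
             notify=lambda to, msg: sent.append((to, msg)))
print(sent[0][0], "-", sent[1][0])  # it-support - finance-data-owner
```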
Frequency of data reconciliation:
- One time: Should be able to perform ad-hoc reconciliations as and when required (eg testing, cut-over planning, pre-go live etc)
- Scheduled: Should be able to perform scheduled reconciliation (eg daily, weekly, month end etc)
Level of data to reconcile:
- Data reconciliation can be resource-intensive if appropriate data filter restrictions and the level of data aggregation are not considered. It is advised to follow the steps below to reconcile data at different levels:
Step 1 – Reconcile data at the highest level of data aggregation (eg company code level etc)
Step 2 – If step 1 does not provide the required information, then reconcile data at medium level aggregation (eg business unit etc) with additional selection criteria
Step 3 – If step 2 does not provide the required information, then reconcile data at low-level aggregation (eg document etc) with additional selection criteria
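The three steps above can be sketched as a drill-down that only descends to a finer grain when the coarser level fails to reconcile. The aggregation levels, keys and figures below are illustrative, not from any specific system:

```python
# Totals per aggregation level, keyed by eg company code or
# (company code, business unit); figures are illustrative.
source = {
    "company": {"1000": 300.0},
    "business_unit": {("1000", "BU1"): 180.0, ("1000", "BU2"): 120.0},
}
target = {
    "company": {"1000": 290.0},
    "business_unit": {("1000", "BU1"): 180.0, ("1000", "BU2"): 110.0},
}

def drill_down(levels, source, target):
    """Reconcile at the highest level first; descend only on mismatch."""
    for level in levels:
        mismatches = {
            key: (source[level][key], target[level].get(key))
            for key in source[level]
            if source[level][key] != target[level].get(key)
        }
        if not mismatches:
            return level, {}          # reconciled at this level, stop here
        result = level, mismatches    # record it, then try the finer level
    return result

level, diffs = drill_down(["company", "business_unit"], source, target)
print(level, diffs)  # business_unit {('1000', 'BU2'): (120.0, 110.0)}
```

Descending only on mismatch is what keeps the process cheap: the document-level comparison (step 3) runs only for the slice already known to disagree, which is also where the selection criteria mentioned above come in.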
Data Reconciliation – Key Challenges
Implementation effort depends on the complexity of the data, the number of source systems and the number of metrics to reconcile. It is thus advised to prioritise the metrics based on business requirements.
Processing time largely depends on the volume of data that has to be read by the database and transferred; a prerequisite for this process is that users set meaningful filter selections in order to keep the volume of data to be transferred as small as possible.
Data quality at the source system is also a contributing factor for data reconciliation. If the quality of data is poor then data reconciliation will not add value.
Maintainability, supportability and developing reconciliations for problem areas are key matters that should be addressed by the IT support team.
Data Reconciliation – Toolset
The selection of an enterprise data reconciliation tool (such as the SAP-certified iRekon – Data Accuracy and Reconciliation) should be driven by the key approach and challenges above; the tool should deliver the following features:
- Single reconciliation system catering to enterprise-wide reconciliation needs
- Users should be able to easily configure simple to complex reconciliation scenarios
- Reduced cycle time in handling complex business scenarios
- Process high transaction volumes in a relatively short time
- Improve operational efficiency – the reconciliation tool should allow operational users to efficiently monitor, track and investigate the reconciliation process across multiple sources
(The above figures are screen captures of iRekon – Data Accuracy and Reconciliation dashboards)
In all, addressing both data quality and data reconciliation as above should enable the enterprise to achieve best-in-class data quality and, most importantly, improve business users’ confidence in data integrity and accuracy.
ℹ For more information about Data Quality and Data Reconciliation solutions you can get in touch at http://goo.gl/TWrEGK