EY Asset Data Intelligence: Using AI to help identify data quality issues leveraging the power of the SAP Business Technology Platform
Before diving into the post, a special shout-out to Lesley Elliot for her leading role and to the team James Blackmore, Krupa Dodhia, Munthir Wagieallah, Joanna P Santos and Nuno Barata for their part in making this happen.
SAP Data Intelligence Business Content
The SAP Data Intelligence product team recently conducted a Packathon to support SAP partners in the development of re-usable content for SAP Data Intelligence to address specific business use-cases. You can learn more about the content sprint in this blog and review example business cases at http://www.sap.com/dataintelligenceusecases.
EY AgilityWorks participated in the Packathon to develop a content offering, EY Asset Data Intelligence, that aims to help organizations improve the quality of their asset data. The purpose of this blog is to share what we achieved with SAP Data Intelligence and to give an overview of our solution.
What is Asset Data Intelligence?
Asset Data Intelligence is a solution developed by EY AgilityWorks, built on the SAP Business Technology Platform (BTP), that aims to automatically identify data quality issues in asset data. The premise is to automate the detection of data anomalies using artificial intelligence (AI) and to integrate the results into business workflows. The intended outcomes are to accelerate time-to-value compared with typical rule-based approaches, to support both reactive and proactive data governance, and to drive new insights through advanced data processing and AI. The primary beneficiaries are organizations that use the Plant Maintenance module in SAP S/4HANA or ERP to store their asset master data. Through a flexible data processing framework and the capabilities of SAP Data Intelligence, the solution can be extended to other data domains in SAP and non-SAP sources.
Who can benefit?
This solution is primarily aimed at organizations that:
- Store large amounts of asset data in SAP or other enterprise systems
- Aim to improve the quality of their asset master data
- Wish to accelerate insights into their asset data quality
- Want to enhance the governance and control over their enterprise data
- Would like to explore the capabilities of SAP Data Intelligence, HANA Cloud and the SAP BTP
How does it work?
The solution runs a four-step process:
- Extract data from the SAP asset hierarchy and classification data tables into SAP HANA Cloud.
- Transform source data to form the input structures required for analyses in subsequent processing steps. A key element here is to profile data in the right business context. This is achieved by partitioning the data prior to processing to optimize the results returned.
  - Identify Relationships: Determine distinct groups of related data characteristics so that the follow-on outlier detection algorithms can run on more focused views of the data. For example, identify that width, height, length and volume form a set of related attributes, and then perform outlier detection on that set specifically. To achieve this, a series of statistical methods is run across the data and a mapping of all relationships is produced to find distinct "feature sets" (groups of related data attributes).
  - Identify Outliers: Mine the data to identify anomalies and quality issues across a range of data types and patterns. To achieve this, outlier detection techniques are applied to each feature set of each asset class. The techniques applied are selected dynamically based on the dimensionality and cardinality of the data.
  - Explanation: Provide supporting information alongside the outlier detection results so that end-users can better understand them and make more informed decisions. This is done in two steps. First, each record receives a score that quantifies the likelihood that it represents a data anomaly. Second, model explanation techniques are applied to each outlier to identify which attributes were significant drivers of the model's conclusion.
- Publish the input data, the identified feature sets and the results of the outlier detection algorithms as HANA views and OData services on the SAP BTP. This makes them available for application development, for consumption in analytics tools such as SAP Analytics Cloud, for direct integration with SAP ERP or S/4HANA, and for follow-on processing in further SAP Data Intelligence pipelines.
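The relationship, outlier and explanation steps described above can be pictured with a small, self-contained sketch. It is an illustration only and not the packaged algorithms: it assumes numeric attributes, partitions records by asset class, uses rank correlation with an arbitrary 0.8 threshold to form feature sets, and applies a single robust z-score (median and MAD) per attribute instead of selecting techniques dynamically. All names (`feature_sets`, `score_outliers`, `analyse`, and the sample pump records) are hypothetical.

```python
from collections import defaultdict
from math import sqrt
from statistics import median


def ranks(values):
    """Rank positions (0 = smallest); ties are not averaged in this sketch."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, i in enumerate(order):
        out[i] = rank
    return out


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy) if vx and vy else 0.0


def feature_sets(rows, attrs, threshold=0.8):
    """Group attributes whose absolute rank correlation meets the threshold."""
    cols = {a: ranks([r[a] for r in rows]) for a in attrs}
    parent = {a: a for a in attrs}  # union-find over attribute names

    def find(a):
        while parent[a] != a:
            a = parent[a]
        return a

    for i, a in enumerate(attrs):
        for b in attrs[i + 1:]:
            if abs(pearson(cols[a], cols[b])) >= threshold:
                parent[find(b)] = find(a)
    groups = defaultdict(list)
    for a in attrs:
        groups[find(a)].append(a)
    return [g for g in groups.values() if len(g) > 1]  # drop unrelated attributes


def score_outliers(rows, fset, cutoff=3.5):
    """Return (row, score, driver) for rows whose robust z-score exceeds cutoff."""
    stats = {}
    for a in fset:
        vals = [r[a] for r in rows]
        med = median(vals)
        mad = median([abs(v - med) for v in vals])
        stats[a] = (med, 1.4826 * mad)  # 1.4826 scales MAD to a std-dev estimate
    flagged = []
    for r in rows:
        # Per-attribute contribution; a zero scale (constant column) contributes 0.
        contrib = {a: (abs(r[a] - med) / scale if scale else 0.0)
                   for a, (med, scale) in stats.items()}
        driver = max(contrib, key=contrib.get)  # attribute that explains the flag
        if contrib[driver] > cutoff:
            flagged.append((r, contrib[driver], driver))
    return flagged


def analyse(rows, attrs):
    """Partition by asset class, find feature sets, then score each set."""
    by_class = defaultdict(list)
    for r in rows:
        by_class[r["asset_class"]].append(r)
    results = []
    for cls, part in by_class.items():
        for fset in feature_sets(part, attrs):
            for r, score, driver in score_outliers(part, fset):
                results.append({"asset": r["id"], "asset_class": cls,
                                "feature_set": fset, "score": round(score, 1),
                                "driver": driver})
    return results


# Hypothetical sample: P7 has a height of 44.0 where 4.4 was probably intended.
ATTRS = ["width", "height", "weight", "install_year"]
SAMPLE = [(1.0, 2.1, 50, 2012), (1.2, 2.4, 61, 2005), (1.4, 2.9, 69, 2019),
          (1.6, 3.1, 82, 2008), (1.8, 3.7, 90, 2015), (2.0, 4.0, 99, 2003),
          (2.2, 44.0, 111, 2010)]
rows = [dict(zip(ATTRS, vals), id=f"P{i}", asset_class="PUMP")
        for i, vals in enumerate(SAMPLE, start=1)]
results = analyse(rows, ATTRS)
print(results)
```

On this sample, width, height and weight end up in one feature set, the installation year is left out as unrelated, and only P7 is flagged, with height reported as the driving attribute. Rank correlation is used here so that a single extreme value does not hide the relationship it violates; a production pipeline would likely use multivariate techniques as well.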
The core components of Asset Data Intelligence are:
- SAP Data Intelligence data pipelines that orchestrate the end-to-end Source->Relate->Analyze->Present Data flow
- SAP HANA Cloud data models to store the source information and the resultant outliers
- SAP HANA views to present the outliers to consuming systems such as SAP Analytics Cloud
The illustrative architecture for Asset Data Intelligence is:
The content package produced as part of the recent SAP Data Intelligence Partner Content Sprint includes the pipelines (including the AI algorithms), the database artefacts, and the application development components needed to get started. The latter include analytical views in HANA Cloud for consumption in SAP Analytics Cloud, and OData service definitions for publishing the data through SAP BTP. NB: Content for SAP Analytics Cloud and extensions of SAP S/4HANA or ERP are not included.
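As a rough illustration of how a consuming application might read the published results, the sketch below builds an OData query URL using the standard `$filter`, `$orderby`, `$top` and `$format` query options. The service root, the entity set name `Outliers`, and the fields `AssetClass` and `Score` are assumptions made for this example; the actual names come from the service definitions in the content package. The numeric literal follows OData v4 conventions.

```python
from urllib.parse import quote, urlencode


def outlier_query_url(service_root, entity_set, asset_class, min_score, top=50):
    """Build an OData query URL for the highest-scoring outliers of one asset class."""
    escaped = asset_class.replace("'", "''")  # OData doubles a quote inside a string
    params = {
        "$filter": f"AssetClass eq '{escaped}' and Score ge {min_score}",
        "$orderby": "Score desc",
        "$top": str(top),
        "$format": "json",
    }
    # quote_via=quote encodes spaces as %20; "$" and "'" are kept readable.
    query = urlencode(params, quote_via=quote, safe="$'")
    return f"{service_root.rstrip('/')}/{entity_set}?{query}"


# Hypothetical service root and entity set, for illustration only.
url = outlier_query_url("https://example.invalid/odata/v4/", "Outliers", "PUMP", 3.5)
print(url)
```

The resulting URL can be passed to any HTTP client or OData-aware tool once authentication for the target BTP environment has been set up.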
What is your experience with asset data intelligence? Please share your insights with us.
You can also reach out to the author of this post directly, or via our website EY AgilityWorks.
The views reflected in this article are the views of the author and do not necessarily reflect the views of the global EY organization or its member firms.