Information Steward 4.2 in Practice: How SAP’s Data Management Organization Uses Information Steward
This blog highlights key points from the ASUG webinar delivered by Michael Palmer (firstname.lastname@example.org). SAP recently acquired Ariba and SuccessFactors, and SAP’s Global Data Management Organization (GDMO) was tasked with assessing the quality of SuccessFactors’ data and integrating it into SAP’s landscape. The GDMO selected Information Steward to accomplish this task. This blog covers how the team built dashboards and scorecards based on reusable business validation rules to monitor and assess data quality over time, and how Information Steward accelerated the integration of the SuccessFactors systems and data. In addition, see how SAP uses Information Steward to ensure quality in CRM and other systems.
To check out the recording and the slides (ASUG members only), go to the ASUG Data Governance page.
The goal of SAP’s data management program centers on customer data. The goals are:
- Analyze the data quality of merger and acquisition data prior to merging it into the SAP data domains
- Assess data quality impact and drive cleansing
- Identify top data quality issues as measured against SAP Global Data Standards
- Support fast analysis cycles with low total cost of ownership (TCO)
- Support flexible business control of rapidly changing data quality rules
To do this, the team uses SAP Information Steward. They also establish standards and processes, accountability for data ownership, and accessibility of the key data domains.
The existing database contains both marketing records and customer records. And, of course, before matching the data, it needed to be cleansed. Multiple passes of the analysis were conducted; this was not just one cycle.
The team chose SAP Information Steward for these reasons:
- Rapid and flexible Data Quality engine: quick and flexible analysis of large volumes of data for profiling and exploration
- Validation rules and scoring: flexible control of data quality rules, scoring and analysis. Rapid rule creation, quality assurance, and solution deployment.
- Turnkey solution for trending and scoring: Ability to rescore data and produce trending results over time (monthly, weekly, etc.)
- Data Quality scorecard: Ability to deploy aggregated domain scores and scorecards by metric, quality dimensions, etc. Easy to share with Data leads, data managers, and business community.
- Integration and Template for future: Ability to integrate results with BI solutions. Reusable validation rules for multiple elements and domains.
- Low TCO: Low TCO compared to a customized BI solution. Fast time to realization with low resource needs. Someone who is an analyst can quickly come up to speed.
The Failed Data Mart is a nice integration point for other solutions, too. Notice the integration with Web Intelligence (Webi).
Keep in mind that it takes a lot longer to agree on the standards and rules within your company than it does to implement them in the tool.
SAP attacks multiple data domains. Each Information Steward project attaches to a domain, and then users are only given access to the domains relevant to them. As you add and authorize users, you can ensure that they can only view data that they have access to.
Step 1: Initial profiling
After a project is created, the administrator connects the project to specific sources.
After data is loaded, perform an initial level of profiling. This profiling helps you determine where you want to write business rules.
SAP found this basic profiling extremely helpful. For example, look at the percentage of nulls in the Country field: why are 5.3% of the values null?
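Information Steward computes these profiling statistics in its own engine; as a rough Python analogue of the null-percentage check described above (the column names and sample data here are hypothetical), the calculation looks like this:

```python
# Illustrative sketch of basic column profiling: what fraction of records
# have a missing value in a given field. Data and field names are made up.
records = [
    {"name": "Acme Corp", "country": "US"},
    {"name": "Globex", "country": None},
    {"name": "Initech", "country": "DE"},
]

def null_percentage(rows, column):
    """Percentage of rows where the column is missing or empty."""
    nulls = sum(1 for r in rows if not r.get(column))
    return 100.0 * nulls / len(rows)

print(round(null_percentage(records, "country"), 1))  # → 33.3
```

A spike in a number like this is exactly the signal that tells you where a business rule is worth writing.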
Step 2: Create rules
In the past, perhaps these rules would have been written in Data Services. But Information Steward has a simple editor with built-in functions.
The categories are used to guide the team and help them understand how to roll up the results into a scorecard. Keep in mind that a rule can be written once and then attached to multiple data elements and data sets. This is a huge time saver, and it ensures consistency across domains.
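Information Steward expresses rules in its own editor with built-in functions; a hedged Python sketch of the write-once, bind-many idea (rule logic, dummy values, and data sets below are all hypothetical) might look like this:

```python
# Sketch: one validation rule reused across multiple data elements,
# mirroring how a single Information Steward rule can be bound to
# several columns and data sets. All names and data are illustrative.
def not_null_or_dummy(value, dummies=("N/A", "UNKNOWN", "TEST")):
    """Rule: value must be present and not a known dummy/default."""
    return bool(value) and str(value).strip().upper() not in dummies

# The same rule bound to two different data elements:
customer_names = ["Acme Corp", "N/A", "Globex"]
vendor_names = ["Initech", "", "Umbrella"]

for dataset in (customer_names, vendor_names):
    passed = sum(not_null_or_dummy(v) for v in dataset)
    print(f"{passed}/{len(dataset)} records pass")
```

Defining the rule once and binding it everywhere is what keeps the scoring consistent across domains.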
Step 3: Rule approval
Make sure that rules are approved by the line of business. Once a rule is approved, it can be reused.
For example, write a rule that the length is 1 or greater and that also checks for dummy/default values; this rule can be reused. Another example is a rule that email addresses conform to the proper format and that dummy email accounts aren’t used. This could also be checked against a lookup table.
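The email rule described above can be sketched in Python terms. The regex, the dummy-account list, and the function name are assumptions for illustration, not SAP’s actual standards:

```python
import re

# Illustrative sketch of the email rules described above: length >= 1,
# no dummy/default accounts, and a basic format check. The dummy list
# and regex here are hypothetical, not SAP Global Data Standards.
DUMMY_EMAILS = {"test@test.com", "noemail@none.com"}
EMAIL_RE = re.compile(r"^[\w.+-]+@[\w-]+\.[\w.-]+$")

def valid_email(value):
    if not value:                      # length check: at least 1 character
        return False
    if value.lower() in DUMMY_EMAILS:  # dummy/default account check
        return False
    return bool(EMAIL_RE.match(value)) # format check

print(valid_email("jane.doe@sap.com"))   # True
print(valid_email("test@test.com"))      # False (dummy account)
print(valid_email("not-an-email"))       # False (bad format)
```

The lookup-table variant mentioned above would simply replace the in-code `DUMMY_EMAILS` set with a table maintained by the business.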
A more complex example is an integrity check: verify that there is a PreSales contact at the partner organization. This is the most complex type of rule.
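An integrity check like this spans two data sets rather than one column. A hedged Python sketch of the idea (table layouts, field names, and data are all hypothetical) is:

```python
# Sketch of an integrity-style rule: every partner record must have at
# least one PreSales contact in a related contacts table. All names
# and data here are illustrative.
partners = [{"partner_id": 1, "name": "Acme"},
            {"partner_id": 2, "name": "Globex"}]
contacts = [{"partner_id": 1, "role": "PreSales"},
            {"partner_id": 1, "role": "Sales"},
            {"partner_id": 2, "role": "Support"}]

# Partners that have at least one PreSales contact:
presales_partners = {c["partner_id"] for c in contacts
                     if c["role"] == "PreSales"}

# Rule failures: partners with no PreSales contact at all.
failures = [p["name"] for p in partners
            if p["partner_id"] not in presales_partners]
print(failures)  # → ['Globex']
```

Because the rule has to correlate records across sources, it is harder to specify and approve than a single-column format check, which is why it sits at the complex end of the spectrum.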
Step 4: Bind the rules and view the results
Bind the rules to specific data sources, run them, and then view the results.
This is the view that the end user would see. Information Steward is not a BI tool, but management found it very helpful to get a single score for a domain.
You can control the coloring and the weighting of the rules. Right now, simple weighting is used, but the team anticipates extending this in the future. In this case, the vendor emails are really tanking the score.
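Information Steward performs this rollup internally; to make the weighted-score idea concrete, here is a hedged Python sketch (rule names, pass counts, and weights are hypothetical) showing how one low-scoring rule drags down the domain score:

```python
# Sketch of a weighted scorecard rollup: each rule contributes its pass
# rate, weighted, to a single aggregated domain score. All numbers are
# illustrative, not real SAP results.
rule_results = {            # rule name -> (records passed, records total)
    "customer_name_valid": (950, 1000),
    "country_not_null":    (900, 1000),
    "vendor_email_valid":  (400, 1000),  # the rule tanking the score
}
weights = {"customer_name_valid": 1.0,   # simple equal weighting, as in
           "country_not_null":    1.0,   # the current setup described above
           "vendor_email_valid":  1.0}

def domain_score(results, wts):
    """Weighted average of per-rule pass rates, as a 0-100 score."""
    total_w = sum(wts.values())
    return sum(wts[r] * passed / total
               for r, (passed, total) in results.items()) / total_w * 100

print(round(domain_score(rule_results, weights), 1))  # → 75.0
```

Raising the weight of a business-critical rule would pull the domain score further toward that rule’s pass rate, which is the kind of extension the team anticipates.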
To check out the demo, ASUG members can view the recording of the webcast.
In a few minutes, 270,000 records are analyzed for basic profiling. Again, the time-consuming part is establishing the rules and getting them approved, not writing them in Information Steward.
For the end users, the group shares a URL to a specific scorecard, so the users don’t have to spend any time navigating the tool.
We needed to build quickly to support the Mergers & Acquisitions process. The solution delivered:
- Rules repository
- Flexible access to data
- Turnkey solution
- Lower TCO
- Template for future
- Visibility and actionable results
Understand that your organization has to be *ready* and mature enough to act on the results that are shared. There needs to be an understanding of standards, rules, and access. Always talking in business terms instead of technical terms definitely helps.
Related blogs on SAP’s Data Governance program: