Using MDG Rule Mining to Improve Data Quality
Defining data quality rules can be a time-consuming task. It often requires multiple emails, meetings, and phone calls between your business teams and master data teams.
Since SAP Master Data Governance on S/4HANA 1909 and on S/4HANA Cloud 2008, Master Data Rule Mining supports business users and master data experts in analyzing their master data for new data quality rules. Machine learning technology finds patterns in the master data and proposes new rules based on these patterns. You can review and accept these rules, and then integrate them automatically into your master data process.
The rule mining solution helps you to significantly reduce the cost to set up rules, because the proposed rules are based on the “as-it-is state” and existing facts within your master data. The integration with the rule repository and the assistance during the transfer to an active rule diminishes the risk of setting up incorrect rules, reduces the time to implementation, and requires less technical expertise. Also, the data correction effort required by implementing these rules is measurable, because the data evaluation of the proposed rules is visible before you accept them as master data quality rules.
This blog explains when and where you can use the rule mining tool.
When and where to use MDG Rule Mining
To better understand when and where the rule mining tool can be used, let’s look at a couple of typical customer stories.
Story 1: We have a Business Problem
Ralph is the MRP Controller for ASAP Inc. He is sick of constantly recurring issues in material availability. These issues are caused by MRP parameters that are not maintained as they should be. Up to now, only “paper-based” rules exist. They are outdated, extended with post-its, and not monitored or enforced by the planning system.
Ralph calls Lisa, the master data steward, and invites her to a meeting to see how they can work together to improve the situation in the system.
In the meeting, Lisa suggests setting up business rules in the system to make sure the data is correctly maintained. Before defining any rule, Lisa suggests using Master Data Rule Mining to find new proposed rules to serve as a basis for their new data quality initiative. They open the Master Data Rule Mining tool together…
Story 2: Data Quality Workshop
Following a reorganization, ASAP Inc. has a new Enterprise Data Management organization focused on assisting the company in its digital transformation. The Enterprise Data Management organization knows that in order to make digital transformation successful, data quality is a top priority, and decides to invest in their master data.
The managers organize an event to bring different experts from across the company (line of business, IT, data experts) together to define steps for improving their master data quality. Because ASAP Inc. doesn’t have a process or rules to maintain its product master data and also lacks the ability to monitor its data health, they choose product master data as the focus for their workshop.
How to use MDG Rule Mining
MDG rule mining can play a big role in the above scenarios. This chapter describes the rule mining concepts and processes.
A complete rule mining process consists of 4 steps:
- Create a Mining Run
- Start a Mining Run
- Find and Accept Mined Rules
- Implement Accepted Rules (automated) in the Rule Repository
To better learn how to use the tool, let’s look at the scenario described in Story 1 above. In this case, we’re working on how to define an MRP group for finished goods. The rules may involve some basic product data such as Material Group, Base Unit of Measure, and MRP Type.
Step 1: Create Mining Run
You trigger the rule mining process by executing a mining run. A mining run tells the system the data you want to focus on when proposing new data quality rules. To create a mining run, open the Manage Rule Mining Run for Products app and choose the + button.
Here is an explanation of the fields you will see on the user interface:
Description: A sentence or key words outlining what you want this mining run to do. This can be useful when identifying a mining run later.
Goal: A detailed explanation of this mining run’s purpose, or the rules you expect from the data.
Tables: A list of tables to be mined at the same time. Under each table, you define the focus area and fields you want to use for the mining run.
Focus Area: The data set you want to use for mining. For example, in this case we want to perform product master rule mining and choose Product Type = Finished Goods (FERT). These areas are carried to the mined rules later.
Fields: A further drilldown of the selected focus areas on field level. The system examines the values of the selected fields to find potential rules.
Checked by Rule: Check this flag if you want this field to be checked as part of a rule. Mined rules are formatted as IF/THEN statements; selecting this flag means that this field will be in the THEN part of the rule.
Condition of Rule: Check this flag if you want the field to be a condition of a rule. Mined rules are formatted as IF/THEN statements; selecting this flag means that this field will be in the IF part of the rule.
Maximum number of rules: The maximum number of rules you want to get from this mining run. The default is 100 rules.
Example: Defining a mining run to discover rules for the MRP Group field
- In the Tables section, choose +. Select the tables Basic Data (MARA) and Plant Data (MARC) from the popup.
- The details page for the table Basic Data (MARA) is displayed in the right panel of your screen.
- In the Focus Areas section, under Filters, choose + and select Material Type (MTART) in the popup. After the popup closes, use the value help to select the value FERT.
- In the Fields section, choose + and select Material Group and Base Unit of Measure. After the popup closes, the flag Conditions of Rule is checked by default.
- Go back to the Tables section and choose Plant Data (MARC). The details page for the table Plant Data (MARC) is displayed in the right panel of your screen.
- In the Focus Area section, a default entry, Plant (WERKS), is already generated. Use the value help to select a plant, for example, 0001.
- In the Fields section, choose + and select the MRP Type. The flag Conditions of Rule is checked by default.
- In the Fields section, choose + and select the MRP Group. Deselect Conditions of Rule and select Checked by Rule.
- Review your data and choose Save.
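The mining run defined in the steps above can be summarized as a small data structure. The following Python sketch is purely illustrative: the class names, attributes, and the description and goal texts are hypothetical and are not part of any SAP API. The technical field names MATKL (Material Group) and DISGR (MRP Group) are standard SAP field names that are not spelled out in this post. The sketch only shows which fields end up as candidates for the IF part and which for the THEN part.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MiningField:
    name: str                        # technical field name, e.g. "MATKL"
    condition_of_rule: bool = True   # candidate for the IF part (checked by default in the UI)
    checked_by_rule: bool = False    # candidate for the THEN part


@dataclass
class MiningTable:
    name: str                        # table name, e.g. "MARA"
    focus_area: Dict[str, str]       # filter field -> selected value
    fields: List[MiningField] = field(default_factory=list)


@dataclass
class MiningRun:
    description: str
    goal: str
    tables: List[MiningTable]
    max_rules: int = 100             # default mentioned in the field list above

    def _qualified(self, flag: str) -> List[str]:
        # Return TABLE-FIELD names of all fields that have the given flag set.
        return [f"{t.name}-{f.name}"
                for t in self.tables for f in t.fields if getattr(f, flag)]

    def condition_fields(self) -> List[str]:
        return self._qualified("condition_of_rule")

    def checked_fields(self) -> List[str]:
        return self._qualified("checked_by_rule")


# Hypothetical description and goal texts; the field setup mirrors the example steps.
run = MiningRun(
    description="MRP Group for finished goods",
    goal="Find rules that determine the MRP Group",
    tables=[
        MiningTable("MARA", {"MTART": "FERT"},
                    [MiningField("MATKL"), MiningField("MEINS")]),
        MiningTable("MARC", {"WERKS": "0001"},
                    [MiningField("DISMM"),
                     MiningField("DISGR", condition_of_rule=False, checked_by_rule=True)]),
    ],
)

print("IF candidates:  ", run.condition_fields())
print("THEN candidates:", run.checked_fields())
```

With this setup, Material Group, Base Unit of Measure, and MRP Type can appear as conditions, while MRP Group is the only field a mined rule can check.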
Step 2: Start Mining Run
Once the mining run is saved successfully, you should get a numeric mining run ID, which is visible in the mining run list.
Choose the Start button on the bottom of your saved mining run. You will get a confirmation popup which tells you how many records are selected by your mining run definition, and the mining is triggered once you confirm it on the popup.
The mining run starts. If you wish, you can stop the mining run while it is still running and make changes to your settings.
You can choose Refresh to check if the mining run is finished. This updates your mining run status and shows the total number of proposed rules in the mining run header.
Step 3: Find and Accept Mined Rules
To open the mined rules, click the Total Rules value in the mining run header. You then get a list of proposed data quality rules from the system. It is important to note that these are only proposed rules. You need to review them (maybe together with other experts in your organization) and accept them if they make sense in the business process. In the end, you need to create data quality rules from them if you want to use them in your master data process. Here is a summary of the fields and their meanings:
ID: Numeric identifier of each mined rule, generated when the mining run is completed.
Description: Readable text explaining the rule. It is displayed in the pattern IF … AND … THEN …
Example: IF Base Unit of Measure = Days AND Material Type = Service THEN MRP Type = Time-phased Planning
Technical Description: The mined rule in technical terms. It is displayed in the pattern IF … AND … THEN …
Example: IF MARA-MEINS = DAY AND MARA-MTART = DIEN THEN MARC-DISMM = R1
Focus Area: The data set to which this mined rule applies, inherited from the focus area you chose when creating the mining run.
Technical Focus Area: The data set to which this mined rule applies, in technical terms.
Data Evaluation: The evaluation result of the mined rule on the data of the selected focus area.
Complies with Rule / Complies with Rule (%): Number/percentage of records from the mining run’s focus area that comply with this rule.
Violates Rule / Violates Rule (%): Number/percentage of records from the mining run’s focus area that violate this rule.
Not Relevant / Not Relevant (%): Number/percentage of records from the mining run’s focus area that are not relevant to this rule, meaning they don’t meet the rule conditions.
Checked Field: The name of the field that the mined rule checks. It is passed on to the final data quality rule when the mined rule is linked.
Status: An indicator of the decision made regarding this proposed rule. Newly proposed rules have a status of Initial, and you can change them to Approved, Rejected, or In Review.
Linked data quality rule: The linked Data Quality Rule, which you create based on a proposed rule or linked manually to an existing data quality rule.
When you find meaningful rules, accept the rules first, and then you can choose the Link dropdown button to link the mined rule to either an existing data quality rule or a new data quality rule. The linked rule is shown in the Linked Data Quality Rule column.
You can also put several mined rules together to create one data quality rule.
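To make the Data Evaluation figures more concrete, here is a minimal Python sketch of how a single IF/THEN rule splits records into Complies, Violates, and Not Relevant. It uses the technical example rule from the field list above (IF MARA-MEINS = DAY AND MARA-MTART = DIEN THEN MARC-DISMM = R1). The function name and the four sample records are invented for illustration; this is not how the app is implemented internally, only the counting logic as the field descriptions explain it.

```python
from typing import Dict, List

Record = Dict[str, str]


def evaluate_rule(records: List[Record],
                  conditions: Dict[str, str],
                  checked_field: str,
                  expected_value: str) -> Dict[str, int]:
    """Count how many records comply with, violate, or are not relevant to a rule."""
    counts = {"complies": 0, "violates": 0, "not_relevant": 0}
    for rec in records:
        # A record is not relevant if it does not meet every IF condition.
        if any(rec.get(fld) != val for fld, val in conditions.items()):
            counts["not_relevant"] += 1
        elif rec.get(checked_field) == expected_value:
            counts["complies"] += 1
        else:
            counts["violates"] += 1
    return counts


# Invented sample data, for illustration only.
records = [
    {"MEINS": "DAY", "MTART": "DIEN", "DISMM": "R1"},
    {"MEINS": "DAY", "MTART": "DIEN", "DISMM": "PD"},
    {"MEINS": "EA",  "MTART": "FERT", "DISMM": "PD"},
    {"MEINS": "DAY", "MTART": "DIEN", "DISMM": "R1"},
]

result = evaluate_rule(records, {"MEINS": "DAY", "MTART": "DIEN"}, "DISMM", "R1")
for key, count in result.items():
    print(f"{key}: {count} ({100 * count / len(records):.0f}%)")
```

The Violates count is the part that indicates the data correction effort you take on when you accept the rule.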
Step 4: Implement Accepted Rules in the Rule Repository (automated)
Once you have created new data quality rules, go to the data quality rule by clicking on the linked data quality rule. You see the selected proposed rules are listed there with status Not implemented. In the related image we have three proposed rules for one data quality rule.
Data quality rule implementation is done in Business Rule Framework (BRF+). When you create a new data quality rule manually, you need to create all BRF+ objects, expressions, and rules in the BRF+ workbench. With the rule mining approach, however, the entire BRF+ implementation is generated automatically when you choose the Prepare button on the Usage screen.
After you choose Prepare, the status of all proposed rules changes to Implemented, and the Scope and Conditions decision table is created in BRF+. You can choose the link to open the generated BRF+ implementation.
Before you can use the rule in the data quality evaluation process or other master data processes, you must approve the data quality rule and enable the usage.
Rule Mining finds patterns in master data by looking at the available combinations of attributes and values. The system outputs combinations of attributes and values that fit certain criteria as proposed rules. Based on business know-how and data evaluation of these combinations, end users decide if these proposed rules qualify as real business rules. Afterwards, the accepted rules can be automatically implemented as data quality rules, which can be used in your master data process.
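The idea of finding combinations of attributes and values that fit certain criteria can be illustrated with a deliberately naive Python sketch. This is not the algorithm the product uses (the post does not specify it). It is a simple support/confidence search, and the thresholds, function name, and sample records are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def propose_rules(records, condition_fields, checked_field,
                  min_support=2, min_confidence=0.8, max_rules=100):
    """Return (conditions, proposed value, support, confidence) tuples."""
    proposals = []
    for size in range(1, len(condition_fields) + 1):
        for combo in combinations(condition_fields, size):
            # Group records by their values for this combination of condition fields.
            groups = defaultdict(Counter)
            for rec in records:
                key = tuple(rec[f] for f in combo)
                groups[key][rec[checked_field]] += 1
            for key, outcomes in groups.items():
                value, hits = outcomes.most_common(1)[0]
                support = sum(outcomes.values())
                confidence = hits / support
                if support >= min_support and confidence >= min_confidence:
                    proposals.append((dict(zip(combo, key)), value, support, confidence))
    # Strongest patterns first: highest confidence, then largest support.
    proposals.sort(key=lambda p: (-p[3], -p[2]))
    return proposals[:max_rules]


# Invented sample data, for illustration only.
records = [
    {"MATKL": "01", "DISMM": "PD", "DISGR": "0001"},
    {"MATKL": "01", "DISMM": "PD", "DISGR": "0001"},
    {"MATKL": "01", "DISMM": "PD", "DISGR": "0001"},
    {"MATKL": "02", "DISMM": "VB", "DISGR": "0002"},
    {"MATKL": "02", "DISMM": "VB", "DISGR": "0002"},
    {"MATKL": "02", "DISMM": "PD", "DISGR": "0001"},
]

proposals = propose_rules(records, ["MATKL", "DISMM"], "DISGR")
for conditions, value, support, confidence in proposals:
    if_part = " AND ".join(f"{f} = {v}" for f, v in conditions.items())
    print(f"IF {if_part} THEN DISGR = {value}  (support {support}, confidence {confidence:.0%})")
```

Even in this toy version, the output is only a list of candidates. Whether a pattern such as "IF DISMM = PD THEN DISGR = 0001" is a real business rule remains a decision for the experts reviewing it.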
Next for You
Mine Meaningful Business Rules from Your Classification Data in Product or Business Partner Master Data
Very useful outline of this new feature of SAP Master Data Governance on SAP S/4HANA 1909!
Thanks for the interesting document. I have one question: can this rule mining also be extended to MDG custom objects, or are there dedicated apps only for Product, BP, and FI?
Yes. The rule mining app is generic. It can also be used for custom objects if you build the model in MDG Consolidation Customizing.
From my point of view, this is a very useful feature. One question is whether an additional SAP Leonardo license is needed for the ML capabilities...
No SAP Leonardo license is needed. This is based on machine learning technology embedded in HANA.
Hi Kefeng Wang,
Thanks for sharing a nice blog! I have a few questions. Could you please find some time to answer them?
Yes, Rule Mining is based on the MDG Consolidation model. It looks at the data directly in the active area. If the data is available in the S/4 system, you can build the flex custom object in the MDG Consolidation model, and then it is possible to use the rule mining functionality.
The assumption is that you run the mining directly in the production system. You can then apply the rules you get from rule mining back to the data in that system.
Thanks for clarifying my questions. I would appreciate it if you could help with the points below as well 🙂
In principle yes: if the database table is filled in the 1909 system and you can manage it as a custom object in MDG Consolidation, then it would be possible to use rule mining.
The comment function is not available yet. If it is urgently needed, we could put it on our roadmap.
Thanks for sharing very useful MDG concept !!
Overall, taking article master creation in a business process as an example: certain rules or validations need to be set up, whether through configuration or custom development, and to achieve them a developer writes a set of business logic. The same logic is then replicated and defined as an ML rule based on a combination of experience and operational data. Is my understanding correct?
MDG Rule Mining proposes rules without knowing the business logic or configuration; it looks at the current state of the master data in the system. The proposed rules can be used to validate the logic that you want to introduce into the system. This is actually Story 3 🙂
I created a data quality rule and a mining run, but it failed with an error. How can I analyze the root cause or debug this error? Please help.
I had the same issue. It's now OK; I solved it with my Basis team: they installed PAL on the HANA database and gave the database user the right authorization: https://help.sap.com/viewer/6d52de87aa0d4fb6a90924720a5b0549/1909.001/en-US/599930b777474da9973c9cf3280bc81d.html
If this issue still occurs after following Marion's hints, you can also check whether the Script Server is running. For more details, see SAP Note 1650957 https://launchpad.support.sap.com/#/notes/1650957
Thanks, it was resolved by Marion's hint.
However, the evaluation process is still not running.
I did the two configurations below for evaluation in the MDQ apps.
BC set for evaluation activated
Evaluation stuck at:
evaluation audit trial
I have created one Data Mining Rule and linked it to a Data Quality Rule.
I opened the Data Quality Rule and clicked the PREPARE option under Usage. Expressions are created, but when I click the Scope and Condition expression links, I am redirected to the BRF plus workbench and don't see any decision table created (it just shows a blank screen).
Should this process be done in the development client and saved in a TR?
THANKS IN ADVANCE.
Hi Teja, if you still have the issue, feel free to send me an email at email@example.com so we can check the details together.
This blog is really informative.
I created mining rule in our S4 system. It is failing with message "Algorithm execution failed for mining run".
In the browser following error message can be seen in debugging:
abap.js:64 Assertion failed: could not find any translatable text for key 'Selected Tables' in bundle '../../../../sap/md_qrlmng_s1/~A4FA8DFFC69C0F21CE5AE7CB7A457AF9~5/i18n/i18n.properties'
There is OSS note 2950938 associated with this issue, but I could not find any further information.
Do we need to raise an OSS message for this one?
Appreciate your help.
Sorry for the late reply. I hope you have solved the problem. If not, please feel free to create an OSS ticket on CA-MDG-ADQ.
Thanks for this great article!
May I know if this function is available in MDG Cloud Edition? Is there any way for SAP ECC 6 to leverage this? Thanks to advise!
This function is not available yet in MDG Cloud Edition. If customers demand it, we would consider enabling it in MDG Cloud as well in the future. If you have a customer case to refer to, feel free to reach out to me.
Thank you for a very informative blog.
I would like to know whether rule mining also supports the automatic implementation of derivation rules from the accepted rules, similar to the validation rules, in MDG on S/4HANA 2022.
For validation rules, we have the option to link to an existing rule or create a new one, and the Prepare option then generates the implementation automatically.
Similarly, do we have anything for derivation rules?
Is it possible to create derivation rules based on mined rules?
Thanks for the comment. It is not available yet, but it is a very interesting point for us.
If you like, feel free to drop me a mail at firstname.lastname@example.org. Maybe we could have a nice discussion about this.
In case you are running into error of 'Algorithm execution failed....
and see more details regarding AFL installation from Mishra Ashish post: Algorithm execution failed for mining run in Rule Mining App | SAP Community