BRFplus in Big Data scenarios
This is my first blog here on SCN, so let me introduce myself first. I'm Daniel Ridder, project manager and lead architect for SAP development projects at AOK Systems GmbH, Germany. Recently my focus has been on business rule management, the development of HANA-ready solutions, and operational analytics in the context of Big Data scenarios.
In this blog I want to give you a brief look at my work and the experience I have gained. So that you do not get bored, I will split this blog into several articles. I think the following structure works well:
- Introduction – What is it all about? – "A brief intro to the business we have to deal with"
- Rulesets for calculating prices in context of Big Data scenarios – "BRFplus and our suggestions for working with big decision tables and using parallelization techniques to deal with a huge amount of data"
This article deals with both topics. Further topics would be:
- Explorative search techniques to determine anomalies in invoices – how could we support the end user during his daily work?
- Operative analytics with BW in ERP
But first things first. I hope I can manage it all… 🙂
Introduction – What is it all about?
The project deals with the processing of medical invoices (ambulatory medical and dental care). These collective invoices are sent quarterly by the respective medical association to each statutory health insurance. Each invoice is accompanied by additional information about diagnoses and medical procedures, especially surgeries.
A single data delivery contains approximately 7,000,000 invoices accompanied by an average of 120,000,000 medical treatments. An invoice and its supporting documentation are therefore modeled as a 1:n relationship. Over a year, an average health insurance company deals with 28,000,000 invoices and 480,000,000 medical procedures. This huge amount of data has to be inspected and billing errors have to be detected. In addition, the system has to calculate the billing amount of each invoice. To detect a billing error you have to combine more than five billing quarters (needed to check for overlapping bills of an insured person). You can imagine what that means for the runtime.
Rulesets for calculating prices in context of Big Data scenarios
After this brief introduction, and given the amount of data to process, I think we can safely talk about Big Data.
One goal is to calculate the amount of each invoice. To do this we have to determine the price of every medical service. It sounds easy, but we have to do it for about 120,000,000 medical procedures in each quarterly collection. After that we can sum up the prices to get the cost of an invoice.
The price is not determined by a simple lookup in a price list. The price of a medical service is defined by a regional billing contract negotiated between the health insurance and the respective medical association. If a service is not covered by such a contract, a general standard contract applies.
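The fallback from a regional contract to the standard contract can be sketched as follows. This is a minimal Python illustration, not BRFplus or ABAP code; the contract layout, association keys, service codes and prices are all invented for the example.

```python
from decimal import Decimal
from typing import Optional

# Hypothetical contracts: service code -> price. All values are invented.
STANDARD_CONTRACT = {"01100": Decimal("20.00"), "01210": Decimal("13.50")}
REGIONAL_CONTRACTS = {
    "ASSOC_01": {"01100": Decimal("22.40")},
}


def price_for(association: str, service_code: str) -> Optional[Decimal]:
    """Return the regional price if the regional contract covers the
    service, otherwise the price from the standard contract (or None)."""
    regional = REGIONAL_CONTRACTS.get(association, {})
    if service_code in regional:
        return regional[service_code]
    return STANDARD_CONTRACT.get(service_code)


def invoice_amount(association: str, service_codes: list) -> Decimal:
    """Sum the prices of all services on one invoice."""
    total = Decimal("0")
    for code in service_codes:
        price = price_for(association, code)
        if price is None:
            raise KeyError(f"no contract covers service {code}")
        total += price
    return total
```

For the invented data above, `invoice_amount("ASSOC_01", ["01100", "01210"])` takes the first price from the regional contract and the second from the standard contract.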
So let me summarize our requirements: we have a huge set of medical data that has to be checked against standard and individual contracts that include billing rules. The management of these rules has to be transparent and flexible, and operations using those rules have to be scalable and high-performing.
For our first attempt we chose BRFplus, and it turned out to be the right decision 😉
With this article I want to share our experience and our solutions to the problems we faced. In advance, I want to thank Carsten Ziegler, Wolfgang Schaper and Tobias Trapp for their support.
Regional billing contracts and the problems they raised
Slightly simplified, a billing contract is a set of rules, each of which is a logical concatenation of expressions connected by AND. This structure maps perfectly onto a decision table.
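To make the structure concrete, here is a small sketch of a decision table in Python. It is not how BRFplus represents or generates decision tables; it only assumes that each row is a list of conditions joined by AND and that the first matching row (scanning from the top) supplies the result. The row contents are invented.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

Condition = Callable[[dict], bool]


@dataclass
class Row:
    conditions: Sequence[Condition]  # all must hold (logical AND)
    result: float                    # value returned when the row matches


def evaluate(table: Sequence[Row], context: dict) -> Optional[float]:
    """Scan the rows from top to bottom and return the first match."""
    for row in table:
        if all(condition(context) for condition in row.conditions):
            return row.result
    return None


# Invented example: a regional rule followed by a more general rule.
table = [
    Row([lambda c: c["association"] == "ASSOC_01",
         lambda c: c["service"] == "01100"], 22.40),
    Row([lambda c: c["service"] == "01100"], 20.00),
]
```

Because the scan stops at the first hit, the position of a row determines both which rule wins and how many rows are checked before it is reached.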
Unfortunately a few problems arose at this point:
The complete billing contract for one quarter contains about 37,000 rules if all 17 medical associations are included. So we were faced with the following problems:
- BRFplus can handle decision tables of this size, but we found that the BRFplus Workbench is no longer the right tool to maintain these rules. We noticed considerable latency while browsing the decision table and while modifying a single row.
During rule execution, the generated code of a decision table processes the rows consecutively, line by line from top to bottom. The order of the rows therefore becomes important; otherwise two problems occur:
- If the invoice being processed is subject to the rules of the 17th of 17 medical associations, we have to evaluate about 34,000 rows before we reach the relevant section of the decision table.
- If the order of the rows in the decision table is arbitrary, a rule for a very frequently billed medical service may sit in the middle or, even worse, at the bottom of the decision table, which leads to poor processing times.
- And last but not least, we noticed considerable growth in the load size of the generated ABAP class of the BRFplus function. On a yearly basis we expect a load size of about 85 MB.
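The figure of about 34,000 rows can be checked with a quick calculation. The sketch below assumes that the 37,000 rules are spread evenly across the 17 associations and that the rows are grouped by association within one table; the real distribution is not given here.

```python
TOTAL_RULES = 37_000   # rules per quarter (from the text)
ASSOCIATIONS = 17      # medical associations (from the text)

# Assumption: an even split of the rules across the associations.
rules_per_association = TOTAL_RULES // ASSOCIATIONS


def rows_skipped_before(position: int) -> int:
    """Rows of other associations that are scanned before the section of
    the association at the given 1-based position is reached."""
    return (position - 1) * rules_per_association


print(rules_per_association)     # 2176
print(rows_skipped_before(17))   # 34816
```

Under these assumptions, an invoice of the last association scans roughly 34,800 rows that can never match before its own section starts.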
Our chosen solutions
We solved topic 2 (the long scan to reach the right association's section) after numerous attempts at finding a suitable way to maintain the decision table. From the end user's point of view it is better to separate the quarterly billing contracts by medical association. So we provide a separate decision table for each quarter and medical association, which is triggered by a gate expression in the corresponding ruleset. All other rules are not evaluated in this case.
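The effect of such a gate can be sketched in Python as a lookup that selects one small table by quarter and medical association before any row is evaluated. This is only an illustration of the idea, not the BRFplus ruleset mechanism; the keys, conditions and prices are invented.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Tuple[Callable[[dict], bool], float]   # (condition, price)
TableKey = Tuple[str, str]                   # (quarter, association)

# Hypothetical: one small table per quarter and medical association.
TABLES: Dict[TableKey, List[Row]] = {
    ("2014Q1", "ASSOC_01"): [(lambda c: c["service"] == "01100", 22.40)],
    ("2014Q1", "ASSOC_17"): [(lambda c: c["service"] == "01100", 21.10)],
}


def price(context: dict) -> Optional[float]:
    """Gate first: pick the table for quarter and association, then
    evaluate only the rows of that table from top to bottom."""
    table = TABLES.get((context["quarter"], context["association"]))
    if table is None:
        return None
    for condition, result in table:
        if condition(context):
            return result
    return None
```

With this layout an invoice of the 17th association never touches the rows of the other 16.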
With this approach we could also solve topic 1 by providing a new frontend for rule maintenance. In addition, the power user is still able to use the Workbench, e.g. to redefine or modify the whole ruleset.
Please note that this approach is quite usual in the BRFplus/DSM world. Since the framework provides an API, it is easy to build an alternative frontend for rule maintenance. Moreover, you can implement additional checks and integrate it with other dialogs.
We solved topic 3 by automatically analyzing the underlying data and rearranging the rules. By creating an internal hit list (provided by an SQL aggregation) with the count of each medical service in the invoice collection, we were able to sort the billing contracts in the ALV grid and generate the decision table in BRFplus in an optimized order.
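The reordering step can be sketched as follows. In the project the hit list comes from an SQL aggregation; here Python's `Counter` stands in for it, and the rule layout (a dict with a `service` key) is invented for the example.

```python
from collections import Counter


def hit_list(billed_services):
    """Count how often each service occurs in the invoice collection."""
    return Counter(billed_services)


def reorder_rules(rules, hits):
    """Put rules for the most frequently billed services first.
    sorted() is stable, so rules with equal counts keep their order."""
    return sorted(rules, key=lambda rule: -hits.get(rule["service"], 0))


# Invented data: service B is billed most often, then A, then C.
billed = ["A", "B", "B", "C", "B", "A"]
rules = [{"service": "A"}, {"service": "C"}, {"service": "B"}]
ordered = reorder_rules(rules, hit_list(billed))
```

Note that reordering is only safe if the rules do not overlap; with first-match evaluation, swapping two rows that can both match the same input would change the result.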
The remaining topic was the load size of the generated BRFplus class. With four decision tables in one ruleset we measured a load size of approximately 5 MB for the generated ABAP class.
If we extrapolate this to a whole year and all medical associations, we would expect a load size of roughly 85 MB (5 MB x 17 medical associations).
Currently BRFplus does not provide an elegant mechanism to deal with this. Our approach is to put each decision table into its own function, which has to be set to functional mode. As a result, the rule system is modularized into different functions. To call the right one, another BRFplus function has to be created. This function evaluates the medical association and billing quarter to choose the function to call.
Another very nice solution would be a function that evaluates a decision table whose single result column can call another function and return a whole structure, not just a single value (see the following screenshot). Unfortunately this is currently not an option in BRFplus.
How to parallelize the calculation of prices
The calculation of prices follows a single-record processing paradigm: the function context in BRFplus is defined by a single invoice. Our tests showed a negative performance impact when fetching data into BRFplus. Even building a function context with more than one invoice showed a negative impact. Since we are convinced that the observed system behavior is related to our own special use case, we would not generalize it to other use cases.
To parallelize the workload we chose the FPP (Framework for Parallel Processing) in ABAP package BANK_PP_JOBCTRL. With this framework we can easily distribute the workload across several server groups. You can find great articles about FPP here on SCN, so there is no need to cover it in this blog.
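The general pattern (cut the invoices into packages, process the packages in parallel, call the rules once per invoice) can be sketched in Python. This is not the FPP API; a thread pool stands in for the distribution across server groups, and `price_invoice` is a placeholder for the actual rule call. Package size and worker count are arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_packages(items, package_size):
    """Cut the work list into packages of at most package_size items."""
    return [items[i:i + package_size]
            for i in range(0, len(items), package_size)]


def price_invoice(invoice):
    """Placeholder for the rule call: one invoice per call."""
    return invoice["id"], sum(invoice["service_prices"])


def process_package(package):
    """Process one package sequentially, invoice by invoice."""
    return [price_invoice(invoice) for invoice in package]


def run_parallel(invoices, package_size=2, workers=4):
    """Process all packages in parallel and merge the results."""
    packages = split_into_packages(invoices, package_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(process_package, packages))
    return {invoice_id: amount
            for package_result in results
            for invoice_id, amount in package_result}
```

In CPython, threads do not speed up CPU-bound work, so the sketch only shows how the work is packaged and merged, not the performance gain that separate work processes provide.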
Thanks to the support of Carsten Ziegler, we can recommend the following OSS notes if you want to work with big decision tables or are facing performance issues:
1930741: Performance Improvements in the BRFplus Design Time
1936926: Runtime improvements for decision tables
1940360: Changes to the Application Server Buffering for BRFplus Meta Data Tables
1938697: BRFplus: Remove unnecessary data conversion
1901192: Syntax error in generated code for decision table
1980560: Performance Improvement for Table Operation Expression
We also implemented note 2022076 (Issues in using reserved namespaces starting with /) to prevent rule execution from falling back into interpretation mode.
Decision tables are one of the most useful expressions in small and large-scale use cases. In large use cases you can use them to code complex rules that are part of insurance contracts. When you work with huge decision tables, don't forget to optimize the order of their rows and think about splitting huge rule systems into different functions, since the load size of a single ABAP class is limited.
Moreover, search OSS for notes that optimize DSM/BRFplus performance. And last but not least: use the API of BRFplus/DSM to build a pleasing UI for the users who maintain the rule system.
I hope this first article gives you an impression of what we are currently working on and some inspiration to work with BRFplus/DSM, especially with decision tables.