Improve Performance of Data Propagation into InfoCubes with an Additional Index on SID Tables
In the past few years, SAP Business Warehouse (SAP BW) systems running on IBM Db2 for Linux, UNIX, and Windows have profited from a number of improvements. These improvements include column-organized tables for SAP BW to speed up query processing (also referred to as BLU Acceleration). In addition, ETL performance on column-organized tables was also improved, for example, with Db2 parallel insert into column-organized tables.
But this is not the end of the story; additional optimizations are possible. In this blog post, I would like to introduce a new optimization that speeds up the lookup of master data during data propagation into InfoCubes. Although the optimization works for both flat and star schema InfoCubes, flat InfoCubes benefit most from the optimization because master data lookup is the most expensive SQL operation during ETL processing.
It’s probably safe to say that data propagation into flat InfoCubes is already much faster than data propagation into star schema InfoCubes since star schema InfoCubes have many dimension tables that need to be maintained. Nevertheless, there’s room for improvement, and this is where additional indexes on SID tables come into play.
Data Propagation Into Flat InfoCubes
So, how does data propagation into flat InfoCubes work, and why do indexes help? Let me start with the SQL workload.
The SQL workload during data propagation into InfoCubes consists of the following operations:
- SELECT of the source data from PSA tables or from other BW InfoProviders
- SELECT of the master data in the source data to check whether the master data already exists
- INSERT of the source data into the InfoCube tables
Performance of INSERT operations was greatly improved with the introduction of parallel INSERT into column-organized tables with Db2 11.1 and vectorized INSERT with Db2 11.5. With INSERT optimization available, master data lookup using SELECT statements is now the most time-consuming part of data propagation into flat InfoCubes and covers 41% of the SQL workload.
So, what can be done about the performance of SELECT statements?
ABAP SELECT Statements for Master Data
During data propagation into InfoCubes, ABAP for-all-entries statements are used to look up master data in SID tables:
select * from /BIC/SPROD_HIE into table l_th_sid_out BYPASSING BUFFER FOR ALL ENTRIES IN l_t_sid_in where /BIC/PROD_HIE = l_t_sid_in-/BIC/PROD_HIE.
These statements are translated into SQL SELECT statements with VALUES clause:
SELECT * FROM "/BIC/SPROD_HIE", ( SELECT * FROM ( VALUES CAST ( ? AS VARCHAR(54) ), ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, …, ? ) AS FAE_TMP ( "C_0" ) GROUP BY "C_0" ) AS "t_00" WHERE "/BIC/SPROD_HIE"."/BIC/PROD_HIE" = "t_00"."C_0" WITH CS
Performance of these lookup statements could be greatly improved with an additional index on the master data SID tables. This index allows an index-only access to retrieve the data that is selected. This additional index is what is now available with SAP Note 2836763.
Sounds good? Then let me tell you about the performance benefit and how you get additional indexes on SID tables.
What’s the Performance Benefit?
The following chart shows the performance impact of the additional index on SID tables that we measured in an SAP-internal test on an SAP BW 7.50 system. The master data lookup time could be reduced by factor 3.3. Together with the optimization of the INSERT time, the overall time spent on SQL processing was reduced by factor 2.6. The overall job runtime was reduced by about 40%. The figures in your customer system might vary depending on your workload.
Note that the additional index is most beneficial for BW systems as of SAP BW 7.30 and higher. In these releases, master data lookup uses ABAP For-All-Entries statements to lookup multiple master data items with one SELECT statement. Older SAP BW releases use single-record SELECT statements for master data lookup. For these, the performance benefit is less considerable (SQL time improvement 21%, job runtime improvement 15% in SAP-internal tests).
How To Add Indexes on SID Tables
With SAP Note 2836763, you can use a new program SAP_RSDB6_SIDTAB_ADD_INDEX to create additional indexes or to drop them again (if not needed). When you run this program in the ABAP editor (transaction SE38), the following screen is shown:
Here you can see the following information:
- The last refresh data and time of the information displayed
- Whether additional indexes are enabled in the BW systems
- How many SID tables with and without correct additional index exist in the BW system
You can schedule batch jobs to create additional indexes on all SID tables if they do not exist yet. You can also drop existing additional indexes if it turns out that you do not need them. After the job to create the additional indexes has run, details about the indexes are shown on the screen (InfoObject, InfoObject Type, Table name, Table organization (row- or column-organized), and database index name).
You can use the program after you have installed the Support Packages mentioned in SAP Note 2836763 or after you have installed the correction instructions. After you have created the indexes, they will be automatically used in the next master data lookup operations. This is ensured with a hint that is added to the SELECT with VALUES SQL statements.
With this blog post I have hopefully shown you that it’s worth getting the latest correction instructions or Support Packages so that you can use additional indexes on SID tables. The indexes accelerate data load in SAP BW ETL processing. If you have any feedback don’t hesitate to leave your comments here.