Technical Articles
Defragmentation of Column-Organized Tables with Db2 11.5.4
Introduction
When you run an SAP Business Warehouse system on IBM Db2 and IBM Db2 BLU Acceleration, it might happen that the column-organized tables in the SAP BW system get fragmented. With Db2 11.5.4, you get new tools for analyzing the level of fragmentation and for defragmentation of column-organized tables. In this blog post, I will explain when you should defragment tables, which of the available tools are advisable for which occasions, and what you can expect from defragmentation.
When and Why Is Defragmentation Needed?
Column-organized tables can get fragmented for the following reasons:
- DELETE of rows
- UPDATE of rows because it is a DELETE+INSERT
You should consider defragmentation in the following situations:
- Measurable performance degradation
- Ratio between number of rows to space consumed gets worse over time
You can expect the following after defragmentation:
- More efficient bufferpool usage
- Space savings
- Probably improved performance
Note: Defragmentation is not necessarily a weekly task. We recommend that you contact SAP support before you defragment tables on a large scale.
Tools for Defragmentation
With Db2 11.5.4, there are additional tools available that you can use for defragmentation. I will describe each of them in turn.
One option that you have for analyzing fragmentation and for defragmentation is using the following two stored procedures, which have been delivered with IBM Db2 11.5.4:
- SAPTOOLS.ADMIN_COL_ANALYZE_FRAGMENTATION
- SAPTOOLS.ADMIN_COL_DEFRAGMENT_TABLE
You can use the first stored procedure to analyze fragmentation and the second one for defragmentation. To check the level of fragmentation, a term called density ratio has been defined. The density ratio is the percentage of valid logical rows in a block of continuous 64K logical rows that includes already deleted rows. Or, to put it simply: It is the percentage of space still occupied in a 64K-row block. The higher the density ratio, the more data will need to be moved to defragment a column-organized table.
For a given table, ADMIN_COL_ANALYZE_FRAGMENTATION returns, beside others, two significant values for ten different density ratios between 0.1 and 1.0 in steps of 0.1:
- Estimated storage freed in MB
- Estimated rows that need to be moved
As an SAP BW administrator, it’s up to you to find a good compromise between freeing up space on the one hand and effort on the other to move rows to the end of the table. Moving rows means that an SQL UPDATE is performed for each row matching the density ratio criteria. This decision is not easy and therefore a new SAP program called RSDB6_COLTAB_DEFRAG has been developed to help in this case. The program is based on the two stored procedures but offers an easier-to-use UI. Its main features are the identification of possible candidates for defragmentation and a heuristic to determine a good density ratio for defragmentation.
To get program RSDB6_COLTAB_DEFRAG, implement SAP Note 2944714 or the support packages in this SAP Note. The SAP Note also contains instructions how to use program RSDB6_COLTAB_DEFRAG. In addition, you’ll find more details about the parameters and usage of the stored procedures.
Using Program RSDB6_COLTAB_DEFRAG
Start the program using transaction SE38. You can select ANALYZE or DEFRAG. For ANALYZE, you can check CANDONLY (for display candidates only) and you can check RUNSTATS. RUNSTATS is helpful if you have just defragmented the table and now want to check the remaining fragmentation by analyzing the table again or if you changed data within the last two hours. For DEFRAG, you can enter the density ratio suggested by ANALYZE.
The following screenshot shows the output of ANALYZE with CANDONLY. The green row indicates the suggested density ratio for the table.
As you can see, analysis is fast; we analyzed 4109 tables in about 10 minutes. One table has been found as a candidate for defragmentation with a suggested density ratio of 0.2.
Choosing the Right Tool for Defragmentation: Program DB6CONV vs. RSDB6_COLTAB_DEFRAG
Remember that there’s another tool that you can use for defragmentation, that’s the program DB6CONV. Prior to Db2 11.5.4, that was in fact the only way to defragment tables. The following table shows differences between the program RSDB6_COLTAB_DEFRAG on the one hand and the program DB6CONV on the other regarding column-organized tables:
Program DB6CONV | Program RSDB6_COLTAB_DEFRAG | |
Is based on | ADMIN_MOVE_TABLE | ADMIN_COL_DEFRAGMENT_TABLE |
Access to source table for other transactions | Read only | Read/write |
Operation | SELECT from source and INSERT into target table | UPDATE on table to defragment |
Affected rows | Copies entire table | Updates a subset of the rows depending on chosen density ratio |
Runtime | Fast | Relatively slow |
Resulting table size | As small as possible | It depends on chosen density ratio |
When to use | Whole table is fragmented | Only same parts of table are sparsely filled |
As you can see above, DB6CONV allows read access only, which limits the availability of the table to SAP applications. However, DB6CONV is fast and results in an optimal small table size.
Program RSDB6_COLTAB_DEFRAG allows full access to the table, but adds extra UPDATE workload to it. Compared to DB6CONV, RSDB6_COLTAB_DEFRAG is slower, and the resulting table size is bigger. Due to this behavior, program RSDB6_COLTAB_DEFRAG only lists tables as candidates for defragmentation up to a density ratio lower or equal 0.4.
Therefore, when you need to consider higher density ratios, you should consider using program DB6CONV.
Conclusion
The new stored procedures in combination with the ABAP program RSDB6_COLTAB_DEFRAG enrich the options for SAP administrators to deal with fragmentation of column-organized tables. Depending on the density ratio required to defragment your tables, these tools are an alternative to using DB6CONV.
If you use this new functionality, I would be very interested in hearing your feedback. Program SDB6_COLTAB_DEFRAG contains a heuristic to suggest candidates and a density ratio for defragmentation. Depending on the customer feedback that we get, we might want to adapt this heuristic in future versions of the program.