Simplified Space Reclaim Using Insert Time Clustering (ITC) Tables
In this blog post, I’m going to explain the Insert Time Clustering (ITC) table type in Db2 for Linux, UNIX, and Windows (Db2 LUW) and show its major benefit: easy space reclamation.
I’ll also show you how you can convert regular tables to ITC tables in an SAP system on Db2 LUW.
ITC tables have been available since Db2 version 10.1 (see New insert time clustering tables – IBM Documentation). This blog post is in large parts identical to a former SAP article (published by my colleague Johannes Heinrich ) that was based on Db2 version 10.1 and is no longer available. I’ve taken over his test results here and have added some more links and comments on ITC features that were introduced later with Db2 version 10.5.
Space Reclaim in Regular (= Row-Based) Db2 Tables
Db2 tables can have a data and index and LOB object. In the following, I’ll concentrate on space reclamation in the data object in regular Db2 tables. For simplicity, I’ll omit data compression because it does not change the big picture.
Let’s Start from the Beginning: Standard Tablespace Layout
The column data of a data object is stored in pages of a tablespace. In a standard SAP installation, tablespaces have a page size of 16 KB. When records are inserted into a newly created table, the data pages are filled one by one in consecutive order. If a record does not fit into one page, it is inserted into the next page. This means that data that is inserted around the same time is also physically stored together, which means it is clustered according to its insert time.
Problem: Unused Space Gets Trapped in the Course of Time
Over time, existing table data may be updated or deleted by the application. If records are deleted, the empty space may be reused by forthcoming inserts into the same table as long as they fit into the gap. Delete and update operations on a table imply that, over time, the sequential order in which the records were initially inserted is lost, and more unused space appears in the table. Such unused space is trapped in the data object of a regular table and is not automatically reusable by other objects in the same tablespace. In addition, such trapped, unused space cannot be returned easily from the tablespace to the file system (using an ALTER TABLESPACE operation).
Space Reclaim Using ADMIN_MOVE_TABLE and DB6CONV
To fully reclaim space in the data object of a regular Db2 table, a data-moving REORG operation is necessary. To perform such a space-reclaiming REORG operation, the Db2 routine ADMIN_MOVE_TABLE can be used that allows you to perform such space reclaiming REORGs online. The SAP DB6CONV utility is a popular ABAP front end to the ADMIN_MOVE_TABLE routine. As an additional benefit, the ADMIN_MOVE_TABLE routine reorganizes also the index and LOB object of a table online.
It’s not mandatory to perform REORG operations for space reclaim purposes on regular Db2 tables because Db2 is able to reuse the space when new data is inserted. However, after a mass delete operation, it may be helpful to be able to free up the trapped space in the database easily and promptly. Reducing the tablespace/database size will also make backup images smaller and backups faster.
Technically Speaking: What Exactly Are ITC Tables?
ITC tables are created by specifying the keyword “ORGANIZE BY INSERT TIME” in the CREATE TABLE statement. ITC tables cluster regular data based on the insert time of the record, that is, data that is inserted around the same time is placed close together. The ITC table technology is based on multidimensional clustering (MDC) tables (see Multidimensional clustering tables – IBM Documentation). In an SAP environment, MDC tables are only used in SAP Business Warehouse (SAP BW) databases. Compared to MDC tables, ITC tables do not have explicitly defined clustering dimensions. That’s why during the design phase at IBM, these tables were even called “zero dimensional clustering tables”. In ITC tables, the data is stored in the order of INSERTs, and therefore, the data is implicitly clustered by insert time.
ITC Table Layout
ITC tables, like MDC tables, are organized in blocks. Each block consists of a number of pages as determined by the extent size. In a typical SAP system, we use an extent size of 2. Given the standard 16K pages, a block consists of 32K. The rather small extent and block size is an advantage – smaller blocks tend to be completely empty more often than larger blocks. Also, with a smaller extent size, small or empty tables will become much smaller (keep in mind that an SAP ERP ABAP database contains an enormous number of tables).
MDC and ITC tables use a special type of index, so-called block indexes. In the MDC case, block indexes can help to retrieve data very efficiently. In the case of ITC, for technical reasons, one block index is created automatically by the database manager. It is just used to maintain blocks and will not be used by the optimizer.
To manage block allocation and quickly find empty blocks, Db2 creates a block map. The block map is stored as a separate object and contains an entry for each block of the table that specifies its allocation status (free, in use, etc.).
Don’t worry: The block map is a small object that can be neglected in space considerations, and it causes low maintenance overhead.
Space Reclaim in ITC Tables
Space that is freed up by a delete operation in a data page of an ITC table cannot be reused by subsequent inserts. The same is true for MDC and column-organized tables (Db2 BLU Acceleration). However, in an SAP environment, MDC and column-organized tables are only used in SAP BW databases, while regular and ITC tables can also be used in SAP ERP databases.
Restriction: ITC tables cannot be used with the Db2 pureScale Feature.
To compensate the missing space reuse in ITC, MDC, and BLU data pages, completely empty extents can be returned to the tablespace by a non-data-moving REORG RECLAIM operation.
As of version 10.5, Db2 is also able to consolidate sparsely filled BLOBs (binary large objects) in ITC and MDC tables with minimal data movement (see Reorganization enhancements make table maintenance easier – IBM Documentation).
This is also covered by REORG RECLAIM and can also be automatically executed by AUTO REORG operations without intervention by a database administrator.
Using ITC Tables: Example
The following is an example of how ITC tables can be used in an ABAP system. You can reproduce this example in your own ABAP-based (test) system on SAP NetWeaver 7.0 or higher. You can also use ITC tables in JAVA stack systems, but then you don’t have the ABAP utilities I’m using below.
Standard SAP NetWeaver systems contain a report that creates data for the tables that are used in the popular ABAP flight demo example. I’m using this report to create a considerable number of flight bookings that result in entries in the SBOOK table, transform the SBOOK table into an ITC table, and start archiving records from it. Afterwards, I’ll show you the effects of a REORG…RECLAIM EXTENTS operation on the space allocation.
- Let’s start report SAPBC_DATA_GENERATOR:
- To create the example “Monster Data Record”, schedule the report as a batch job with SAP transaction SM36 by using the predefined variant SAP&BC_MONSTER. Monitor the batch job using SAP transaction SM37 and wait until it finishes.
- On the Db2 command line, issue the following SELECT statement (replace SAPMI1 with your ABAP schema name):
db2 "SELECT data_object_p_size, index_object_p_size, reclaimable_space FROM TABLE(SYSPROC.ADMIN_GET_TAB_INFO('SAPMI1', 'SBOOK')) AS T" DATA_OBJECT_P_SIZE INDEX_OBJECT_P_SIZE RECLAIMABLE_SPACE -------------------- -------------------- -------------------- 236384 96992 0
In this example, table SBOOK needs about 230 MB disk space for its data and around 95 MB for its indexes. Not surprising, because so far there is no space to reclaim.
Converting a Table to an ITC Table
Until now, I’ve only inserted data into table SBOOK. Before you perform some deletions by means of data archiving, you need to convert this standard table to an ITC table. In a system based on AS ABAP, the best way to do this is to use the DB6CONV report and perform an online table move:
- Start report DB6CONV and choose “New Conversion”.
- On the “Single Table Conversion” tab page, enter SBOOK as table name, enter the tablespaces that SBOOK is currently using, and choose “Yes” for the “Insert Time Clustering” field:
- Save the data, start the conversion, and monitor it.
- After it finishes, call SAP transaction SE14, enter SBOOK as table name, and choose “Storage Parameters”. This should show that SBOOK is now an ITC table as follows:
- To analyze the effects of space reclamation, it is useful to examine the block map mentioned above. You can do so by using the /DM action of the db2dart tool to generate a textual report that contains the block map of table SBOOK.
NOTE: Running db2dart is an offline operation.Check out the following figure that is based on such a report and shows the current allocation of the first 600 blocks:
The first block is reserved for system usage; all other blocks are filled with data. So, no surprise here.
Archiving Data from table SBOOK
The tables of the ABAP flight demo example have their own archiving object.
- To archive records out of table SBOOK and the other example tables, go to SAP transaction SARA and enter BC_SBOOK as archiving object.
- Assuming the basic archive customizing is in place, select the Write action and create a variant for the write program. Let’s assume you want to archive all flight bookings with a flight date older than 01.01.2012:
By doing so, you archive based on a time-related criterion. This improves your chances to meet the required access pattern for insert time clustering tables (data that is inserted together should be deleted together) because the INSERTs were also based on a time-related criterion (the insert time).
Remember that you can only reclaim space from completely empty blocks. Also note that other archiving criteria are available here such as the connection number or airline that might not be time-related. So, the fact that you’re archiving from a table does not necessarily imply that this table is a suitable candidate for ITC, but the selection criteria in the write program are also an important factor.
- Specify a start date and the spool parameters, run the archiving job, and wait until it finishes.
- After the archiving job has finished, select the Delete action in transaction SARA, choose the archive that was just created, specify the start date and spool parameters, and run the delete job.
After these actions, the table size has not changed, but now there is some reclaimable space:
db2 "SELECT data_object_p_size, index_object_p_size, reclaimable_space FROM TABLE(SYSPROC.ADMIN_GET_TAB_INFO('SAPMI1', 'SBOOK')) AS T" DATA_OBJECT_P_SIZE INDEX_OBJECT_P_SIZE RECLAIMABLE_SPACE -------------------- -------------------- -------------------- 236384 96992 100704
What’s important is that there are groups of completely empty (“FREE”) blocks that can be reclaimed now. Remember that “empty block” here means that all records that are located on the two pages that constitute this block are deleted. The distribution of these blocks is not important and depends on the application (in our case, the report that created the sample data). As you can see, the archiving criterion (flight date) does not perfectly match the insert order of this table, but it is no problem. No matter where the empty blocks are located, you can start reclaiming their space now.
In standard row-based Db2 tables or MDC tables, such free blocks can be reused for upcoming INSERTs into this table. This does not happen with ITC tables for a good reason because it would destroy the insert time order. New records require further block allocations at the end of the block map.
Reclaiming Space from ITC Tables
You perform the following command on the Db2 command line to reclaim unused storage in the table objects for table SBOOK:
db2 "REORG TABLE SAPMI1.SBOOK RECLAIM EXTENTS ALLOW WRITE ACCESS"
Reclaiming extents is fast and can be performed online, with read access or offline. As expected, the above statement reclaims all empty blocks and therefore shrinks the size of the table:
db2 "SELECT data_object_p_size, index_object_p_size, reclaimable_space FROM TABLE(SYSPROC.ADMIN_GET_TAB_INFO('SAPMI1', 'SBOOK')) AS T" DATA_OBJECT_P_SIZE INDEX_OBJECT_P_SIZE RECLAIMABLE_SPACE -------------------- -------------------- -------------------- 135680 96992 0
Blocks that were marked as “FREE” in the block map are now in status “RELEASED”. This means that they are given back to the tablespace and can now be acquired by any table in the tablespace or returned to the file system using an ALTER TABLESPACE command. The block map itself does not shrink; existing entries in the block map with status “RELEASED” can be reused later.
This also shows that the REORG…RECLAIM EXTENTS operation did not include any data movement.
Note that this test was performed on Db2 version 10.1. On Db2 version 10.5 or higher, minimal data movement may occur when Db2 tries to consolidate sparsely filled extents. On the other hand, the table size on Db2 10.5 or higher after a REORG operation may be a little smaller.
Finding Suitable ITC Candidates
You probably want to find out which regular tables are suitable candidates for the conversion to ITC tables.
Typical ITC candidates are:
- Tables from which records are regularly deleted
- Tables that show the typical access pattern (data that is inserted together is also deleted together)
In an SAP system, this pattern applies to tables from which data is archived. Therefore, the DB6CONV report identifies ITC candidates based on the statistics of the SAP archive development kit (ADK). Upon request, DB6CONV examines these statistics and creates a list of all tables for which the following applies:
- They are not already ITC or MDC tables
- At least 1 GB in size
- They are part of an archiving object for which at least one completed archiving session exists. Usually this means that data was actually deleted from this table, although exceptions from this rule are possible.
To access this functionality, do the following:
- In the DB6CONV report, choose New Conversion and go to the Database-Level tab page.
- Choose the radio button Convert Insert Time Clustering (ITC) Candidates to ITC Tables.
- Choose Create Queue.
- Assuming the SBOOK table in the example above is larger than 1 GB and has not already been converted to an ITC table, a dialog box such as the following is displayed if you choose the Display/Edit Queue button:
This tells you that table SBOOK with the current size of 1.2 GB that belongs to the archiving object BC_SBOOK was identified as a candidate for ITC conversion. 593.691 BC_SBOOK objects were archived and deleted. Note that this does not necessarily mean that the same number of records were deleted from table SBOOK. The table resides in a tablespace with an extent size of 32 KB. The extent size is an indicator for how likely it is that there will be completely empty blocks after archiving sessions.
- Now it’s your decision if you actually convert the table to ITC by setting column ITC to YES as in the screenshot above, or not (set column ITC to NO).
Keep in mind that the DB6CONV report may not be able to determine all ITC candidates. There may be other tables that show the ITC access pattern but do not have an archiving object. For example, queue tables that store a work list of data. Such tables are empty after the work list was processed but they may become very large in exceptional cases.
So, apart from ITC candidates suggested by DB6CONV, it is still possible to convert any standard table to ITC if you observe a suitable access pattern on this specific table.
Automation of Space Reclamation in ITC Tables
You don’t have to manually execute REORG…RECLAIM EXTENTS on your ITC and MDC tables.
Instead, you can use Db2 Automatic Reorg to execute this task. The RECLAIM EXTENTS functionality for ITC and MDC tables can be added to the automatic REORG XML policy of your database.
It’s part of the standard automatic REORG policy recommended by SAP.
To check and maintain the current automatic REORG settings, you can use the DBA Cockpit.
1. Call transaction DBACOCKPIT.
- Choose Configuration→Automatic Maintenance->Automatic Reorg and go to the Automatic REORG tab page where you find the following settings:
For the minimum SAP BASIS Support Package levels that are required for this screen, see SAP Note 2171465.
As an alternative, you can run the following statement on the Db2 command line to fetch the current version of the automatic reorganization policy into <policy_file>:
db2 "CALL SYSPROC.AUTOMAINT_GET_POLICYFILE('AUTO_REORG', ‘<policy_file>’)"
If you use this statement as Db2 instance owner, the <policy_file> is afterwards located in the home directory of the instance owner under sqllib/tmp. The following is an example of a reorg automation policy:
<?xml version="1.0" encoding="UTF-8"?> <DB2AutoReorgPolicy xmlns="http://www.ibm.com/xmlns/prod/db2/autonomic/config"> <ReorgOptions dictionaryOption="Rebuild" indexReorgMode="Online" useSystemTempTableSpace="true” reclaimExtentsSizeForTables="100000” reclaimExtentsSizeForIndexObjects="50000" /> <ReorgTableScope maxOfflineReorgTableSize="1"> <FilterClause>TABSCHEMA NOT LIKE 'SYS%' AND (TABSCHEMA,TABNAME) NOT IN (SELECT TABSCHEMA, TABNAME FROM SYSCAT.EVENTTABLES) </FilterClause> </ReorgTableScope> </DB2AutoReorgPolicy>
In this example, the reclaimExtentsSizeForTables attribute specifies that the RECLAIM EXTENTS operation runs automatically for MDC and ITC tables if 100 MB or more free space can be reclaimed. The reclaimExtenstsSizeForTable attribute considers ITC and MDC tables.
You can edit this file manually and store it again on the Db2 server with the help of the AUTOMAINT_SET_ POLICY stored procedure.
The ITC table type allows for an easy, light-weight space reclamation with minimal data movement for tables where data is regularly deleted. Even before the space is reused by forthcoming inserts into the same table, this freed-up space can be reused by other tables in the same tablespace or even be released to the file system by an ALTER TABLESPACE operation.
Consider using ITC tables if the following applies to your system:
- You have tables with large amounts of data.
- You have tables with regularly deleted data (for example, in the case of regular archiving runs).
- You want to reduce the size of your database and database backup images to a minimum.
Please feel free to discuss your experience with ITC tables in the comment section of this blog post.
SAP Note 2171465