Currently, most enterprise database products are designed for either OLTP or OLAP because of architectural limitations. Since no single system could deliver both OLTP and OLAP performance, enterprises deploy separate databases to satisfy different business needs. Data then lives in different databases, the transaction system and the analysis system are separated, and data management becomes rather difficult.
SAP HANA, SAP's next-generation in-memory database, was designed from the start to combine OLTP and OLAP in one system.
SAP HANA introduces the concept of the "column" table. In a traditional row table, records are stored sequentially in data blocks; in a column table, each column is stored sequentially instead. This layout makes column aggregations such as sum and count very fast. On the OLTP side, however, a column table suffers because the data structure inside the data blocks must be adjusted whenever records change.
To solve the OLTP performance problem of column tables, SAP HANA gives every column table a special data area called the delta area. Each column table therefore has two storage areas: the main area and the delta area. Data in the delta area is stored row-wise, and the writes of OLTP transactions land there first. When the delta data meets certain conditions, a merge operation moves it into the main area and converts it to column format. This is how SAP HANA serves both OLTP and OLAP in one database.
From the application perspective, there are several further ways to improve OLTP performance on column tables.
Disable Auto Delta Merge
A column table merges its delta area into the main area once the delta data grows large enough. While the merge runs, the delta area is locked, all OLTP transactions are blocked, and data cannot be updated. Automatic delta merge is enabled when a column table is created, which makes the timing of merges unpredictable. If transactions arrive frequently during a merge, performance drops significantly.
The recommendation is to disable automatic delta merge on the affected tables with the following SQL command:
alter table [table_name] disable automerge;
The feature can later be re-enabled; from then on the database again monitors the delta area and merges when the conditions are satisfied:
alter table [table_name] enable automerge;
Since data in the delta area is stored in row format while the main area is in column format, operations such as table joins perform poorly: two segments of data in different storage formats must be processed, and the calculation keeps switching between the column engine and the row engine. We recommend a manual merge when the delta area has grown large or after executing a sequence of transactions:
merge delta of [table_name];
Use Multithreading
Most of the time an OLTP transaction is a light database operation that takes only milliseconds, so a single thread gains little. The database locks a record when it is inserted, updated, or deleted, but different records do not affect each other, so transactions are not blocked across multiple threads. With multithreading, CPU wait time and database response time shrink and performance improves.
Generally speaking, performance improves as the number of threads increases, but because of the limits of the server and the database itself, it changes only slightly beyond a certain point and eventually levels off.
Multithreading can be achieved by creating multiple connections through the JDBC or ODBC driver, or by opening multiple consoles with SAP HANA's hdbsql.
For single-row insertions, we can simply open multiple database connections and run them in parallel. Another case is a bulk insertion such as "insert into [table1] select * from [table2]" where table2 holds a large amount of data. Since this SQL executes on only one thread, we should split it into several parts by column value. For example, if the table contains an ID column, the statement can be split according to the value distribution and replaced by the series "insert into [table1] select * from [table2] where ID like '%[0-9]'". These statements can then run in parallel on multiple threads, improving performance.
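As a sketch, the suffix-split series above can be generated programmatically; the table names table1/table2 and the ID column come from the example and are placeholders:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: build ten INSERT statements, one per last digit of ID, which can
// then be executed on separate connections in parallel. Table and column
// names follow the example above and are placeholders.
public class PartitionedInsert {

    static List<String> buildStatements(String target, String source) {
        List<String> statements = new ArrayList<>();
        for (int digit = 0; digit <= 9; digit++) {
            statements.add("insert into " + target
                    + " select * from " + source
                    + " where ID like '%" + digit + "'");
        }
        return statements;
    }

    public static void main(String[] args) {
        for (String sql : buildStatements("table1", "table2")) {
            System.out.println(sql);
        }
    }
}
```

Each generated statement covers the rows whose ID ends in one digit, so together the ten statements cover the whole source table exactly once.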
Use Partitions
Partitioning applies only to column tables in SAP HANA. It has several advantages:
1. Data is distributed across partitions, so multithreading becomes effective: each thread processes one partition in parallel.
2. A table can be split according to the business scenario. For example, years of historical sales data can be partitioned by month, so that an analysis of a given month searches only one partition, reducing the data volume scanned.
3. OLTP data is distributed across partitions, which prevents frequent writes from concentrating on one data area.
SAP HANA provides three partition types: hash, range, and round-robin. Hash and range partitioning are defined on one column or a combination of columns; the difference is that hash partitioning distributes rows by the hash value of the columns, while range partitioning defines value ranges over them. Generally speaking, range partitioning is used for date-typed columns and hash partitioning for ID columns. Round-robin partitioning assigns records to partitions in turn, making it the most uniform partition type.
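For illustration, the three partition types could be declared as follows on a hypothetical SALES table (the table and column names are placeholders):

```sql
-- hash partitioning on an ID column, 4 partitions
CREATE COLUMN TABLE SALES_H (ID INT, SDATE DATE, AMOUNT DECIMAL)
  PARTITION BY HASH (ID) PARTITIONS 4;

-- range partitioning on a date column, one explicit range plus a rest partition
CREATE COLUMN TABLE SALES_R (ID INT, SDATE DATE, AMOUNT DECIMAL)
  PARTITION BY RANGE (SDATE)
  (PARTITION '2012-01-01' <= VALUES < '2012-02-01',
   PARTITION OTHERS);

-- round-robin partitioning, 4 partitions (defined without a partitioning column)
CREATE COLUMN TABLE SALES_RR (ID INT, AMOUNT DECIMAL)
  PARTITION BY ROUNDROBIN PARTITIONS 4;
```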
One best practice is to combine partitioning with multithreading. Not only OLTP transactions benefit from this; operations such as delta merge do too: "merge delta of [table_name] part [partid]" merges the delta data of a single partition.
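A minimal sketch of this combination: generate one merge statement per partition ID so that each can be issued on its own connection in parallel (the table name and partition count are placeholders):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: one "merge delta ... part n" statement per partition, so the
// merges can run in parallel on separate connections. SALES and the
// partition count are placeholders.
public class PartitionMerge {

    static List<String> mergeStatements(String table, int partitionCount) {
        List<String> statements = new ArrayList<>();
        for (int part = 1; part <= partitionCount; part++) {
            statements.add("merge delta of " + table + " part " + part);
        }
        return statements;
    }

    public static void main(String[] args) {
        for (String sql : mergeStatements("SALES", 4)) {
            System.out.println(sql);
        }
    }
}
```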
Meanwhile, partitions have a cost for the database: the data and log volumes grow in order to store the additional partition information.
Tips for the JDBC Driver
SAP HANA provides several data access drivers for external applications, such as ODBC, JDBC and MDX. MDX is mainly used for modeled views in SAP HANA, while JDBC and ODBC are more common in OLTP. For the JDBC driver, a few tips help improve OLTP performance.
Closing Auto Commit
Compose your database operations into one transaction and commit manually.
Commit More Records Once
Every commit has a cost. Reduce the number of commits where conditions permit.
Using Batch Commit
In JDBC, batch execution reduces the number of round trips between the program and the database when transferring data.
// obtain a connection and a statement (auto-commit is still on by default)
conn = DriverManager.getConnection(connection, username, password);
stmt = conn.createStatement();
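A fuller sketch combining the three tips above, assuming a hypothetical MYTABLE(ID, VAL) and placeholder connection settings; only the JDBC calls themselves (setAutoCommit, addBatch, executeBatch, commit) are standard API:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Sketch: auto-commit off + addBatch/executeBatch + a single manual commit.
// MYTABLE, the URL, and the credentials are placeholders for illustration.
public class BatchInsert {

    static final String INSERT_SQL = "insert into MYTABLE (ID, VAL) values (?, ?)";

    static void insertAll(Connection conn, int rows) throws SQLException {
        conn.setAutoCommit(false);                 // tip 1: commit manually
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            for (int i = 0; i < rows; i++) {
                ps.setInt(1, i);
                ps.setString(2, "value-" + i);
                ps.addBatch();                     // tip 3: collect rows into a batch
            }
            ps.executeBatch();                     // one round trip for all rows
        }
        conn.commit();                             // tip 2: one commit for all rows
    }

    public static void main(String[] args) {
        String url = args.length > 0 ? args[0] : "jdbc:sap://localhost:30015";
        try (Connection conn = DriverManager.getConnection(url, "user", "password")) {
            insertAll(conn, 1000);
        } catch (SQLException e) {
            System.out.println("connection failed: " + e.getMessage());
        }
    }
}
```

With auto-commit left on, each of the 1000 inserts would be its own transaction; here they travel in one batch and commit once.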