Additional Blogs by SAP
cancel
Showing results for 
Search instead for 
Did you mean: 
Former Member

There has been some confusion in database-land about the terms

  • Log Volume
  • Data Volume
  • Log backup files
  • archived
  • written
  • committed
  • rollback
  • savepoint
  • point-in-time recovery

People tend to address this confusion by defining the terms. But I would like to tell you a story about the life of a data block that gets into a HANA database. Note that my explanations are specific to the HANA database management system running in a Dell single-node appliance but shed a light to "all" SQL databases.

To keep it simple let's start with a hello world example. First we create a table world:

dellhana:~ # hdbsql -u system -p mypassword -i 42

Welcome to the SAP HANA Database interactive terminal.

Type:  \h for help with commands         

       \q to quit                        

hdbsql=> create table world ( c varchar(20) )

Then we add a line:

insert into world values('hello')

This is now an uncommitted (database) transaction. To get it onto disk use the command

commit

Whenever you commit an insert, update or delete statement the data is written to disk, into the log volume (/hana/log), in a file called a log segment. Log segments are by default 1GB:

dellhana:~ # ls -ltr /hana/log/mnt00001/hdb00002

-rw------- 1 anaadm sapsys 1073741824 Aug  6 12:12 logsegment_000_00000024.dat

-rw------- 1 anaadm sapsys    1359872 Aug  6 12:14 logsegment_000_directory.dat

-rw------- 1 anaadm sapsys 1073741824 Aug  8 12:07 logsegment_000_00000025.dat

In this example mnt00001 is the first node in the HANA cluster (there is only one as we are talking about single-node HANA). hdb00002 belongs to the indexserver service (find this out with hdbstudio -> Administration Console -> Volumes). It is August 8 and the current log segment is number 25. There you can find our "hello" string:

dellhana:~ # strings /hana/log/mnt00001/hdb00002/logsegment_000_00000025.dat | grep hello

hello

Once the log segment containing the "hello" string is filled, a new log segment will be started. The old segment will get a flag as "to be archived". When the next log backup is performed, the "hello" segment will be written and the "to be archived" flag will be removed. The backup is called "archive log" in Oracle lingo and "log backup files" in HANA lingo. When your redo log volume runs full, the log segments that have been archived will be overwritten. If there are no archived log segments left, logmode normal will stop operations in order to allow corrective actions. Logmode overwrite will overwrite log segments even if they have never been archived. Neither logmode will overwrite data that is needed for HANA to start up.

Find out your HANA's log mode by running the SQL command:

SELECT value FROM "SYS"."M_INIFILE_CONTENTS" WHERE KEY='log_mode'

Once a commit is done after the insert statement, other transactions can read this new table row. Also, during the next savepoint, "hello" will be written into the table "world" in the data volume. By default, HANA performs a savepoint every 5 minutes. Having a short interval for the savepoints will not only reduce data loss when the log volume fails, but also lead to faster start-up times of the HANA database.

Why does a shorter savepoint interval reduce database startup times? Let's imagine you issue

insert into world values('hello')

commit

Now the redo logs contain an information like "table world, first row is 'hello' starting 3:01pm". When the system reboots at 3:02pm, the database will have to recover. The database will redo all committed transactions from before reboot and after the last savepoint. Then it will rollback all the uncommitted transactions from the same timeframe. The shorter that timeframe is the less needs to be done.

The above example also shows how point-in-time recovery works. When you keep one data backup and all log backups you will be able to tell the value of any cell in any table of your database at a specific time. This first row in table "world" is empty at 3:00pm. And at 3:02pm it contains "hello" (in the above example).

And if you now imagine that you restore one backup at a secondary site and ship the stream of log backups from the primary to the secondary site you understand asynchroneous datacenter mirroring by transaction log shipping.

Now our data block "hello" has been inserted, committed, written to the redo log volume, has survived a reboot, was saved to the data volume, backed up to the archive logs and transmitted via transaction log shipping. This story has helped me in recent times to memorize the terms correctly and should help you to lead an educated discussion.

2 Comments