
One of the most boring topics in the DBMS area is surely locking.
It’s also one of the topics that most DB users try to avoid as much as possible.

Still, it's of utmost importance to correctly understand how locking works in the database of your choice, to be able to work with it successfully and without bad surprises.

Here are two cases of MaxDB locking behaviour that are not obvious at all (and also not yet well documented).
Note: the first part "grew" while I was researching it, so I split this blog into two parts. The second part will be released within the next few days.

1) When not to trust count(*)

With the advent of the filedirectory counters, a neat query optimization for the popular 'select count(*) …' was implemented.
Whenever the optimizer realizes that the user just wants to know the total number of rows in a table, this number is retrieved from the filedirectory counter of the table and not from the table itself.

The performance gain is immense: instead of reading potentially thousands of pages and counting the rows found, MaxDB just looks up the counter and returns the number to the user.
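
For reference: the LOCKTEST table used in the examples below is just a small test table with a single character column as its primary key. The original DDL isn't shown in this blog, so the following setup is only an assumed sketch that reproduces the demo data:

create table locktest (the_row varchar(20) primary key)
insert into locktest values ('1')
insert into locktest values ('2')
-- ... and so on for '3' to '9' ...
insert into locktest values ('10')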

But what happens when there is a transaction open during this query?
Let’s see:

####Session 1:
select * from locktest
THE_ROW
-------
1
10
2
3
4
5
6
7
8
9

select count(*) as cnt from locktest
CNT
---
10

explain select count(*) as cnt from locktest
SCHEMANAME  TABLENAME  COLUMN_OR_INDEX  STRATEGY                                PAGECOUNT
------------------------------------------------------------------------------------------
LARS        LOCKTEST                    TABLE SCAN                                       1
                                        COUNT OPTIMIZATION                     
                                             RESULT IS COPIED   , COSTVALUE IS           2
                                        QUERYREWRITE - APPLIED RULES:          
                                           DistinctPullUp                                1
####Session 2:
insert into locktest (select the_row+20 from locktest)
More than one row updated or deleted. Affected Rows:  10
####Session 1:
select count(*) as cnt from locktest
CNT
---
20 
####Session 2:
rollback
Statement successfully executed. No Result
####Session 1:
select count(*) as cnt from locktest
CNT
---
10

As we can see, select count(*) returned a count that includes the additional 10 rows, although they were never committed!
Luckily the explain tells us about the optimization by printing out the “COUNT OPTIMIZATION” line.

A way out?

But how can we force MaxDB to actually deliver the result for committed rows only?
Until writing this blog, I thought the solution would simply be to trick the optimizer out of the COUNT OPTIMIZATION, e.g. by adding a dummy predicate that evaluates to TRUE for all rows, say 1=1.

Let’s see how this works:

####Session 1:
explain select count(*) as cnt from locktest where 1=1
SCHEMANAME  TABLENAME  COLUMN_OR_INDEX  STRATEGY                                PAGECOUNT
------------------------------------------------------------------------------------------
LARS        LOCKTEST                    TABLE SCAN                                       1
                                             RESULT IS COPIED   , COSTVALUE IS           3
                                        QUERYREWRITE - APPLIED RULES:          
                                           DistinctPullUp                                1

Ok, the “COUNT OPTIMIZATION” is gone, so we should expect the ‘correct’ counting of data…

####Session 2:
insert into locktest (select the_row+20 from locktest)
More than one row updated or deleted. Affected Rows:  10
####Session 1:
select count(*) as cnt from locktest where 1=1
CNT
---
20

WOW – wait a minute!
Still the same wrong result?
YES!
So I’ve written nonsense above?

Nope, what I wrote is actually true.
So how come we still see the uncommitted data?

The point here is that there are some more optimizations built in that are anything but obvious.

Clever tricks 

When the dummy predicate is applied, MaxDB actually does walk through the whole table.
So far the explain is completely correct.

But what happens with the rows read is what makes up the odd behaviour.
For each row found, the kernel now checks whether the row qualifies against the predicates.
Part of this is recognizing that only a count is wanted and that no actual data from the table needs to be returned.
So the kernel takes a shortcut and reports back that the row did not fit, but increments a counter instead. This is then done for all rows in the table.

The key point here is that the kernel only checks the locks of a row once it has been recognized as a "qualified row".
Therefore this (obviously old) count(*) optimization takes its shortcut before the lock handling, thereby always delivering uncommitted read results.

Phew… weird, isn't it? But things get creepier!

Another way to fool the optimizer out of this set of optimizations is to use "count(column_name)" instead of "count(*)".
The problem with that is that "count(column_name)" only counts those rows for which the column is not NULL.
This might not be a problem for most SAP tables, but it surely can be an issue for others.

Basically, only two "safe" ways exist to force the committed read-count.
In both of the following cases the kernel really reads all rows and really has to check locking by looking for entries in the lock lists.
The two ways are:

1) using “count(pk_column_name)”

In our example column “THE_ROW” forms the primary key, so let’s check this counting approach:

####Session 2:
insert into locktest (select the_row+20 from locktest)
More than one row updated or deleted. Affected Rows:  10
####Session 1:
select  count(the_row) as cnt from locktest
CNT
---
10

explain select count(the_row) as cnt from locktest
SCHEMANAME  TABLENAME  COLUMN_OR_INDEX  STRATEGY                                PAGECOUNT
------------------------------------------------------------------------------------------
LARS        LOCKTEST                    TABLE SCAN                                       1
                                             RESULT IS COPIED   , COSTVALUE IS           3
                                        QUERYREWRITE - APPLIED RULES:          
                                           DistinctPullUp                                1

If we change the table and add a new column which might contain NULL values, we can observe how fragile this approach can become:

####Session 2:
rollback
alter table locktest add (the_row2 varchar(20))
commit

Now set some values…

update locktest set the_row2='x' where the_row >='5'
More than one row updated or deleted. Affected Rows:  5
commit
select * from locktest
THE_ROW  THE_ROW2
-------  --------
1        ?
10       ?
2        ?
3        ?
4        ?
5        x
6        x
7        x
8        x
9        x
 

If we now just use THE_ROW2 for counting, we get the number of rows where THE_ROW2 is NOT NULL, regardless of the locking state of the rows:

select count(the_row2) as cnt from locktest
CNT
---
5
And, of course, locking would be adhered to in this case.
That means that without the commit after the update, a second session doing the count(the_row2) would simply be blocked until the commit has been issued (assuming we work at least in isolation level 1 – committed read).
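
To make this concrete, here is a sketch of that blocking scenario (statements only, no output shown; it assumes both sessions run with at least isolation level 1):

####Session 2:
update locktest set the_row2='y' where the_row='1'
-- no commit yet, so the changed row stays locked
####Session 1:
select count(the_row2) as cnt from locktest
-- this statement now has to check locks and therefore waits ...
####Session 2:
commit
-- ... only now does session 1 return its count of committed rows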

2) using multiple aggregate functions at once, e.g. "count(*), count(*)"

####Session 2:
insert into locktest (select the_row+20, the_row2 from locktest)
More than one row updated or deleted. Affected Rows:  10
####Session 1:
select count(*) as cnt_1, count(*) as cnt_2 from locktest
CNT_1   CNT_2
-----   -----
   10      10
  
explain select count(*) as cnt_1, count(*) as cnt_2 from locktest
SCHEMANAME  TABLENAME  COLUMN_OR_INDEX  STRATEGY                                PAGECOUNT
------------------------------------------------------------------------------------------
LARS        LOCKTEST                    TABLE SCAN                                       1
                                             RESULT IS COPIED   , COSTVALUE IS           3
                                        QUERYREWRITE - APPLIED RULES:          
                                           DistinctPullUp                                1

 

NOTE: Here, session 1 also hangs until session 2 commits or rolls back!

That’s it (for now)

That’s it about the count(*) locking effects that may occur with MaxDB.
Quite strange stuff going on here, isn’t it?

So why aren't there problems with MaxDB and the SAP products all the time?
Simply because the NetWeaver application servers use uncommitted read (a.k.a. isolation level 0) all the time!
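
For completeness: in MaxDB the isolation level is usually chosen when the session connects. If I recall the syntax correctly, that looks roughly like the following sketch (user and password are placeholders; please check the CONNECT statement documentation of your version):

-- assumed syntax sketch: 0 = uncommitted read, 1 = committed read
connect someuser identified by somepassword isolation level 0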

The data access layer is designed to know that rows found in the database can already be outdated by the time they arrive. Since there is a whole business transaction handling engine implemented in the NetWeaver stack ("above" the DBMS), it really doesn't matter at all for the work processes.
Whenever a user needs to 'really' put his hands on a row, the application needs to get an enqueue anyhow, and that ensures that the correct values are retrieved and no other users can change them while the user is working on the data.

And oh, yes, this is just another big argument against direct database access …

Continue reading about locking behaviour in part II of this blog, coming up this week.


5 Comments


  1. Thorsten Franz
    Lars,
    Yes, you did win my interest. I'm anxiously waiting for the next part and hoping that it will show how to bypass all these problems. In most cases I expect we should be on the safe side using SAP JPA and pessimistic locking via the locking service. I expect that with a @version field optimistic locking should work as well – but what about those cases where single record access is not sufficient and selects or updates over entire tables are needed? I hope the next part will come soon (and contains some Java code).
    Cheers,
    Thorsten
    1. Lars Breddemann Post author
      Hello Thorsten,

      I'm happy to have such a prominent audience, but pretty sad, on the other hand, to disappoint you right up front.
      The next part will “just” cover deadlock detection – specifically when deadlocks are not detected…

      So, I guess, no Java code then.

      However, yes, with the transaction model employed by the current ABAP/J2EE stack, you are pretty safe.
      When you select data from multiple tables in a statement you can be sure to get a consistent view of the data as of the beginning of the statement – that's the same as with e.g. Oracle from the user side.

      The problems really only come up when you, as a developer, are not aware of the locking behaviour and assume characteristics of your data that cannot be guaranteed.

      And, let's be clear about it: locks are performance/scalability killers. That's one of the reasons the ABAP stack uses uncommitted reads. Otherwise everybody would spend MUCH time waiting to get a share lock for reading some data.

      Before I forget it: if you’ve got a specific question (say, a case where you need help with), just let me know and I’ll see whether I can do something about it.

      cheers,
      Lars

      1. Thorsten Franz
        Lars,
        Thanks for your quick response. I’m sure the next part will be as interesting as this one, so no problem if it goes into a different direction. 🙂
        I know pretty well how OpenSQL behaves in the ABAP stack, but I have (naturally) much less experience with how well the AS Java handles database abstraction.
        Maybe some day you could write a blog on how to stay on the safe side and know that you always work with transaction-consistent data even if you go cross-platform.
        Say you have an application that contains some JPA entities, a JPA query, and even some JDBC (using OpenSQL). What do you need to do in order to always keep a consistent picture?
        Another hot topic, I assume, is moving database applications from the AS Java’s system database to a non-system database connected through a custom datasource, say when the database gets so big that you need a differently parametrized DB schema. What do you need to do in order for the application to behave consistently? As far as I know, you can’t run OpenSQL against a non-system database connection, so you would have to use Vendor SQL or Native SQL. How would SAP JPA behave differently?
        I fear that all these questions must appear quite stupid to someone with your database experience. In my defence, the ABAP stack's DB isolation layer works so perfectly well that in the previous 10 years I never had to worry much about avoiding DB-specific problems. 🙂
        Cheers,
        Thorsten
        1. Lars Breddemann Post author
          Hi Franz,

          the questions are really not stupid at all.
          Unfortunately you put the finger on a spot that is a rather bad one.
          “landscape-consistent” data access is something that basically does not exist in the form of a consistent single query-statement.
          Whenever you cross the borders of the transaction management you're using (DBMS, application server, transaction engine, choose one…), you are forced to add synchronization mechanisms yourself. And getting these implemented correctly is in fact really difficult.

          At least there is one thing I can reassure you with: with secondary database connections you can have consistent database transactions even with non-system databases. With the current versions of NetWeaver this is heavily used, e.g. by BI or SCM (sometimes even to the same database… don't ask… ;-)).

          Well, I see this topic would be worth a chat in person. Let's see whether we can meet @TechED Vienna!

          cu Lars

