One of the most boring topics in the DBMS world surely is locking.
It's also one of the topics that most DB users try to avoid as much as possible.
Still, it is of utmost importance to correctly understand how locking works in your database of choice to be able to work successfully with it and without bad surprises.
Here are two cases of MaxDB locking behaviour that are not obvious at all (and also not yet well documented).
Note: the first part "grew" while I was researching it, so I split this blog into two parts. The second part will be released within the next few days.
With the advent of the filedirectory counters, a neat query optimization for the popular 'select count(*) ...' was implemented.
Whenever the optimizer realizes that the user just wants to know the total number of rows in a table, this number is retrieved from the filedirectory counter of the table and not from the table itself.
The performance gain is immense: instead of reading potentially thousands of pages and counting the rows found, MaxDB just looks up the counter and returns the number to the user.
But what happens when there is a transaction open during this query?
Let's see:
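Here is a minimal sketch of the experiment. The table name LOCKTEST, the data type and all concrete values are made up for illustration; only the column name THE_ROW reappears in the example further below:

    -- Session 1: a table with 10 committed rows ...
    CREATE TABLE locktest (the_row VARCHAR(10) PRIMARY KEY);
    INSERT INTO locktest VALUES ('01');
    -- ... 9 more inserts for '02' up to '10' ...
    COMMIT;
    -- ... plus 10 additional rows that stay uncommitted:
    INSERT INTO locktest VALUES ('11');
    -- ... 9 more inserts for '12' up to '20', but NO commit!

    -- Session 2 (isolation mode 1, committed read):
    SELECT COUNT(*) FROM locktest;
    -- returns 20, although only 10 rows are committed

    EXPLAIN SELECT COUNT(*) FROM locktest;
    -- the execution plan contains the "COUNT OPTIMIZATION" line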
As we can see, select count(*) returned the count including the additional 10 rows, although they were never committed!
Luckily the explain tells us about the optimization by printing out the "COUNT OPTIMIZATION" line.
But how can we force MaxDB to actually deliver the result for committed rows?
Until writing this blog I thought the solution would simply be to trick the optimizer out of the COUNT OPTIMIZATION, e.g. by adding a dummy predicate that evaluates to TRUE for all rows, say 1=1.
Let's see how this works:
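Sketched against the made-up LOCKTEST table from above:

    EXPLAIN SELECT COUNT(*) FROM locktest WHERE 1=1;
    -- no "COUNT OPTIMIZATION" line in the execution plan anymore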
Ok, the "COUNT OPTIMIZATION" is gone, so we should expect the 'correct' counting of data...
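Again sketched against the made-up demo data:

    SELECT COUNT(*) FROM locktest WHERE 1=1;
    -- still returns 20 - the 10 uncommitted rows are counted anyway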
WOW - wait a minute!
Still the same wrong result?
YES!
So I've written nonsense above?
Nope, what I wrote is actually true.
So how come we still see the uncommitted data?
The point here is that there are some more optimizations built in that are anything but obvious.
When the dummy predicate is applied, MaxDB actually walks through the whole table.
So far the explain is completely correct.
But what happens with the rows read is what makes up the odd behavior.
For each row found, the kernel now checks whether the row qualifies against the predicates.
Part of this is to recognize that just counting is wanted and no actual data from the table should be returned.
So the kernel takes a shortcut: it reports back that the row did not fit, but increments a counter instead. This is then done for all rows in the table.
The key point here is that the kernel only checks the locks on a row once it has been recognized as a "qualified row".
Therefore this (obviously old) count(*) optimization takes its shortcut before the lock handling kicks in, thereby always delivering uncommitted-read results.
Another way to fool the optimizer out of the set of optimizations is to use "count(column_name)" instead of "count(*)".
The problem with that is that "count(column_name)" only counts those rows for which the column is not NULL.
This might not be a problem for most SAP tables, but surely can be an issue for others.
Basically only two "safe" ways exist to force the committed-read count.
In both of the following cases the kernel really reads all rows and really needs to check locking by looking for entries in the lock lists.
The two ways are:
In our example column "THE_ROW" forms the primary key, so let's check this counting approach:
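Once more sketched against the made-up demo data:

    SELECT COUNT(the_row) FROM locktest;
    -- every row is now really read and checked against the lock lists,
    -- so only committed data ends up in the result (the statement may
    -- have to wait for open locks, depending on the isolation mode)

    EXPLAIN SELECT COUNT(the_row) FROM locktest;
    -- no "COUNT OPTIMIZATION" in the plan here either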
If we change the table and add a new column which might contain NULL values, we can observe how fragile this approach can become:
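For instance (column name THE_ROW2 as used below; the statement is sketched and the exact ALTER TABLE syntax may differ):

    ALTER TABLE locktest ADD (the_row2 VARCHAR(10));
    -- the new column is NULL in all existing rows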
Now set some values...
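Sketched here with made-up values:

    UPDATE locktest SET the_row2 = 'x' WHERE the_row IN ('01', '02');
    COMMIT;
    -- now only two rows have a NOT NULL value in THE_ROW2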
If we now just use THE_ROW2 for counting, we will get the number of rows for which THE_ROW2 is NOT NULL, regardless of the locking state of the rows:
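Sticking with the sketched demo data:

    SELECT COUNT(the_row2) FROM locktest;
    -- returns 2: rows where THE_ROW2 IS NULL are simply not counted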
And, of course, locking would be adhered to in this case.
That means that without the commit after the update, a second session doing the count(the_row2) would simply be blocked until the commit has been issued (assuming we work at least in isolation mode 1, committed read).
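A sketch of the blocking situation (the session numbering matches the note below):

    -- Session 2: update a row, but do NOT commit.
    UPDATE locktest SET the_row2 = 'y' WHERE the_row = '01';

    -- Session 1 (isolation mode 1, committed read): this statement
    -- hangs until session 2 commits or rolls back.
    SELECT COUNT(the_row2) FROM locktest;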
NOTE: Here, session 1 is also hanging until session 2 commits or rolls back!
That's it about the count(*) locking effects that may occur with MaxDB.
Quite strange stuff going on here, isn't it?
So why aren't there problems with MaxDB and the SAP products all the time?
That's simply because the NetWeaver application servers use uncommitted read (a.k.a. isolation level 0) all the time!
The data access layer is designed around the fact that rows found in the database may already be outdated by the time they are used. Since there is a whole business transaction handling engine implemented in the NetWeaver stack ("above" the DBMS), it really doesn't matter at all for the work processes.
Whenever a user needs to 'really' put his hands on a row, the application needs to get an enqueue anyhow, and that ensures that the correct values are retrieved and no other users can change them while the user is working on the data.
And oh, yes, this is just another big argument against direct database access ...
Continue reading about locking behavior in part II of this blog, coming up this week.