Additional Blogs by SAP
cancel
Showing results for 
Search instead for 
Did you mean: 
Former Member
0 Kudos

There is almost no customer system that was not affected at some point in time by a deadlock situation, even if many users and administrators are not aware of this. This blog series will focus on the different types of deadlocks and what can be done in case a particular deadlock occurred.

This article will cover the simple case of deadlocks caused by row level locks, which are usually the result of application problems (wrong parallelization) or programming errors.

What is a deadlock?

A deadlock can occur when two or more users are waiting for a data locked by each other. Oracle automatically detects deadlock situations and resolves them by rolling back one of the statements involved in the deadlock, thereby releasing one set of the conflicting row locks.

This can be demonstrated pretty easy via 2 sqlplus sessions:

SESSION 69:

SQL> update sapnw2.sflight set price='100' where carrid = 'LH';

10 rows updated.

SESSION 67:

SQL> select * from sapnw2.nriv where object = 'RESB' for update;

We can have a look at v$lock in order to have a more detailed overview about the locks:

SQL> select * from v$lock where type = 'TX';

ADDR             KADDR                   SID TY        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK
---------------- ---------------- ---------- -- ---------- ---------- ---------- ---------- ---------- ----------
000000006A875E50 000000006A875FD8         69 TX     524293       1805          6          0       1017          0
000000006A8986A8 000000006A898830         67 TX     393219       1867          6          0        624          0

The locks of the two sessions are hold with LMOD 6, which means 'exclusive'.


SESSION 69:

SQL> select * from sapnw2.nriv where object = 'RESB' for update;

Of course, this session now appears to 'hang', since session 67 sill holds a lock on the requested rows:

ADDR             KADDR                   SID TY        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK
---------------- ---------------- ---------- -- ---------- ---------- ---------- ---------- ---------- ----------
000000006BF56B88 000000006BF56BA8         69 TX     393219       1867          0          6         57          0
000000006A875E50 000000006A875FD8         69 TX     524293       1805          6          0       1422          0
000000006A8986A8 000000006A898830         67 TX     393219       1867          6          0       1029          1

We now see a new entry for Session 69, this time with REQUEST = 6, this means that the session now requests an exclusive lock.

SESSION 67:

SQL> update sapnw2.sflight set price = '150' where carrid = 'LH';

This will cause the following error in session 69:

ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource

Although the example appears to be rather 'artificial' (sflight is not the typical business relevant table), deadlock and wait situations that involve the NRIV table are very frequent in SAP systems. There will be a separate article dealing exclusively with NRIV at a later point in time.

Analysis steps

Usually, you are given the task of analyzing this kind of problem post mortem: The issue has already occurred and should not occur again, especially, when for example a business critical job has terminated. Our first entry point for problem analysis is the Oracle alert log:

Sun Nov 25 08:54:16 2007
ORA-00060: Deadlock detected. More info in file /oracle/NW2/saptrace/usertrace/nw2_ora_31341.trc.

The mentioned trace file contains the deadlock graph, which looks like this:

Deadlock graph:

                       ---------Blocker(s)--------  ---------Waiter(s)---------

Resource Name          process session holds waits  process session holds waits

TX-00080005-0000070d        13      69     X             16      67                 X

TX-00060003-0000074b        16      67     X             13      69                 X

session 69: DID 0001-000D-00000210      session 67: DID 0001-0010-000001B5

session 67: DID 0001-0010-000001B5      session 69: DID 0001-000D-00000210

We can see the following here:
SID 69 holds the lock TX-00080005-0000070d in X mode
SID 69 waits for TX-00060003-0000074b in X mode

SID 67 holds the lock TX-00060003-0000074b in X mode
SID 67 waits for TX-00080005-0000070d in X mode

You will also notice that the waits are always in 'X' mode. This means that we are dealing with row level locks. The other possibilities will be shown in the next blog. Please note that in case of values not equal 'X', totally different analysis methods are necessary.

Rows waited on:
Session 67: obj - rowid = 0000B8EA - AAALjqAAGAAAWN9AAC
  (dictionary objn - 47338, file - 6, block - 91005, slot - 2)
Session 69: obj - rowid = 00006367 - AAAGNnAAGAAAqpGAAl
  (dictionary objn - 25447, file - 6, block - 174662, slot - 37)

This is very important information. First of all, we can determine which objects are involved:

SQL> select object_name, object_type from dba_objects where object_id = '47338';
 
SFLIGHT TABLE 

SQL> select object_name, object_type from dba_objects where object_id = '25447';

NRIV TABLE

From an application point of view, it is perhaps also help full to investigate the affected rows:

select * from sapnw2.sflight where rowid = 'AAALjqAAGAAAWN9AAC';

select * from sapnw2.nriv  where rowid = 'AAAGNnAAGAAAqpGAAl'


Information on the OTHER waiting sessions:
Session 67:
  pid=16 serial=459 audsid=-1 user: 0/SYS
  O/S info: user: nw2adm, term: pts/1, ospid: 31488, machine: nwtest
            program: sqlplus@nwtest (TNS V1-V3)
  application name: sqlplus@nwtest (TNS V1-V3), hash value=0
  Current SQL Statement:
  update sapnw2.sflight set price = '150' where carrid = 'LH'
End of information on OTHER waiting sessions.

Only when knowing both sides of the problem, you can draw the complete picture. When it comes to SAP environments, this means the following:

  • The session that was terminated because of the ORA-00060 will generate a short dump (ABAP - Transaction ST22) or a Call Stack (JAVA - Server log). This would reveal one of the involved programs/transactions.
  • The session that continues because the other transaction was rolled back due to the deadlock is not immediately known to us. But the 'application name' part of the 'OTHER waiting sessions' section is populated with the name of the ABAP program / JAVA class that is responsible for the problem too. Of course, since my test case was done via sqlplus, this is not too helpful.
What can I do now?
  • In case one of the two sides involve customer own coding, get in contact with the responsible developer and forward the information that you have extracted from the deadlock graph to him.
  • In case SAP standard coding is involved, open a customer message on the  marketplace. 
    Very important: Open the message on the component to which one of the affected programs belongs, not on the component BC-DB-ORA. This kind of problem (deadlock due to row level locks) can only be investigated from an application perspective.
    Update your problem description with the information extracted from the deadlock graph.