
What do the little grey cells say?  They tell me this upgrade will be just another dusty manila-jacketed case file in a few weeks, just one more notch on the barrel in a never-ending line of patches, large and small, to one or another of the houses of punch cards that support the glass towers in the silos of industry. Back when ERP 6 was the bee’s knees, all the shills said “you’ll never patch in this town again.” I’m planning to stick around long enough to prove them wrong.

In the last episode of The ABAP Detective (Part 1 of this saga), I penned the story from the viewpoint of the source code system, the development box, the thinker’s statue.  This time, days before the production downtime, I’ll look at the volume testing and code base of the production copy. But before that, what’s the mystery here? Do I even have a case, or am I dreaming?

Figure 1

TST-20121210-A1.png

The heavy hitters in a transaction system are the reports that run in parallel, like 18-wheelers jolting down a New York side street with a load of flowers, or the sanitation trucks that take away the same vegetation a few days later. Or they’re the unnecessary buzz of the crowd outside a dance hall, magpies chattering about nothing. Or just the strident jolt of a stack of plates in a diner when the shifts change. For Figure 1, it’s a SQL statement, seen in all its glory via ST04.

What does this show about an ECC upgrade? In my case book, it’s the same line as before, a buffer that runneth over.  Harmless by itself, but a portent of a tuning opportunity. I’d need to know how much it runs, how much it costs, whether it’s worth redemption.

Figure 2a

TST-20121210-B1.png

Figure 2b

TST-20121210-B2.png

Two rows down from the table buffer overrun (which ran 600 times and pulled in 25 million rows…) is a statement against CDHDR.  So, yeah, the new system has the same guts as the old system: change pointers, an index into a cluster table, and lots and lots of data.

SELECT
  *
FROM
  "CDHDR"
WHERE
  "MANDANT"=:A0 AND "OBJECTCLAS"=:A1 AND "OBJECTID"=:A2 AND
  ("UDATE"=:A3 AND "UTIME">=:A4 OR "UDATE">:A5) AND
  ("UDATE"=:A6 AND "UTIME"<=:A7 OR "UDATE"<:A8)


So, good statement, or bad statement?  It’s got a few of the key fields, and a date/time range that would work best if the user or the code puts the screws to the limits.  If not, with the range left wide open, maybe bad.  We’d need to see the literals, not this bound version.  What does the optimizer do?

SELECT STATEMENT ( Estimated Costs = 2 , Estimated #Rows = 0 )

  2 TABLE ACCESS BY INDEX ROWID CDHDR
    ( Estim. Costs = 1 , Estim. #Rows = 1 )
    Estim. CPU-Costs = 7,368 Estim. IO-Costs = 1
    Filter Predicates

    1 INDEX RANGE SCAN CDHDR~0
      ( Estim. Costs = 1 , Estim. #Rows = 1 )
      Search Columns: 3
      Estim. CPU-Costs = 5,777 Estim. IO-Costs = 1
      Access Predicates

Primary key, hoping for the best. There aren’t a lot of suspects to choose from in this field.

Column          Distinct values
MANDANT                       2
OBJECTCLAS                   62
OBJECTID              2,138,363
CHANGENR            279,045,703

Back of the envelope math (would need to do frequency analysis via DB05 for more dirt) – almost 300 million records, and with OBJECTID specified the key fields narrow the haul to roughly the total row count divided by the distinct OBJECTID count, so maybe we only get a hundred-odd rows to sift through. Maybe.
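That arithmetic fits on the back of an envelope, but here it is scribbled out anyway. A minimal Python sketch using the distinct-value counts from the listing above; the per-object figure is an average, and assumes the changes are spread evenly across objects, which is exactly what DB05 would confirm or deny:

```python
# Distinct-value counts for CDHDR, copied from the listing above.
total_changes = 279_045_703      # CHANGENR: roughly one per change record
distinct_objectid = 2_138_363    # distinct OBJECTID values

# With MANDANT, OBJECTCLAS and OBJECTID all pinned down, the average
# hit count is the total row count divided by the distinct OBJECTID
# count (even-spread assumption).
avg_rows_per_object = total_changes / distinct_objectid

print(f"~{avg_rows_per_object:.0f} rows per OBJECTID on average")
```

A hundred-odd rows out of nearly 300 million is a respectable haul, provided the date/time fence around them is actually closed.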

Figure 3

TST-20121210-C1.png

Figure 4

TST-20121221-A1.png

Flipping ahead a few days to look at the system logs, rather than the shared SQL cursor area.  My notebook shows the trash cans and recycle bins overflowed, causing a bit of havoc as the stat files could not be updated.  A minor transgression, but once again, the underlying data structures are pretty much the same as the last several generations of this code.

Looking at the case from a different angle, the operator view of the system logs:

dev_w11

M Wed Dec 19 17:29:49 2012

M  ***LOG R4F=> PfWriteIntoFile, PfWrStatFile failed ( 0028) [pfxxstat.c   5445]

M  ***LOG R2A=> PfWriteIntoFile, PfWrStatFile failed (/usr/sap/???/DVEBMGS02/data/stat) [pfxxstat.c   5450]

M

M Wed Dec 19 17:33:59 2012

M  ***LOG R4F=> PfWriteIntoFile, PfWrStatFile failed ( 0028) [pfxxstat.c   5445]

M  ***LOG R2A=> PfWriteIntoFile, PfWrStatFile failed (/usr/sap/???/DVEBMGS02/data/stat) [pfxxstat.c   5450]

Yup, stat file nolo contendere.  Here’s another view, from the UNIX operations level:

$ df -g  /usr/sap/???/DVEBMGS02/data/stat

Filesystem    GB blocks      Free %Used    Iused %Iused Mounted on

/dev/???sap???01lv      2.00      0.00  100%    26719    99% /usr/sap/???

Figure 5

TST-20121221-B1.png

Figure 5 shows more system log noise, with an enqueue_read (sorry, enque_read) that failed with a bad user id.   SCN shows 4 hits for this error, with no clear resolution (at least to this ABAP Detective).  Hope it gets better soon.

Figure 6

TST-20121221-C1.png

Back to some basic database / application server inter-operability checks. Table buffers.  Drab stuff, but users like fast buffers.  Oh, that’s right, keep the data tables in memory.  Sound familiar?  Where did I hear that before?

This view is sorted by invalidations; usual suspects at the top.  The rogues to be fixed are hiding a little farther down.

Figure 7

TST-20121221-D1.png

Generic key buffer – my favorite.  Okay, the quality system isn’t as big as production.  We’d want more than 30 MB available for common data on each application server.

Figure 8

TST-20121221-E1.png

A cross-town taxi ride to SM66, Global Work Process Overview, shows the busy traffic as would be expected in an enterprise system. Mostly direct reads (meaning waiting for the database to respond), and interactive sessions doing communications.  Not bad, but those 10-thousand second reports need pruning eventually.

Figure 9

TST-20121221-F1.png

Flipping back to ST04, database cache SQL tuning, I’ve opened a background check on the VBEP table.  What is it doing for 18K seconds in a Z program?

Figure 10

TST-20121221-G1.png

One execution, a million disk reads, 45 rows returned.  Great.  A lot of noise from such a pretty little gun.
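Those numbers deserve one more back-of-the-envelope ratio. A quick Python sketch; the figures come from the cache statistics above, and “reads per returned row” is this detective’s own yardstick, not an official ST04 metric:

```python
# Cache statistics from the screenshot: one execution of the Z statement.
disk_reads = 1_000_000   # physical reads for the single execution
rows_returned = 45       # rows the caller actually got back

# How many blocks did the database touch for each useful row?
# Anything in the thousands smells like a scan, not an index lookup.
reads_per_row = disk_reads / rows_returned

print(f"~{reads_per_row:,.0f} disk reads per returned row")
```

Tens of thousands of reads per row returned is the kind of ratio that gets a statement booked downtown.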

Figure 11

TST-20121221-H1.png

Urm, why did VBAK show up here?

TST-20121221-H2.png

Ah, there’s VBEP.  Along with VBAP and VBAK.  A three-way join of major tables, a few million rows in each.  It’s Z code.  You know.

Figure 12

TST-20121221-I2.png

Last shot, then we’ll put this news rag to bed.  Z program, joining VBEP, VBAK and VBAP, just like it says. Into an internal table – classic ABAP styling. All it needs is double-breasted brass buttons and a feather in the hat band.

There’s a database hint embedded – not for the faint of heart. Whether the hint still makes sense in the current database version is left as an exercise for the student. That’s you.  What do you think?

PART 1 – The ABAP Detective And The Degraded Upgrade – PART 1

PART 3 – The ABAP Detective And The Degraded Upgrade – PART 3


2 Comments


  1. Lars Breddemann

    Well written and entertaining – just as usual 🙂 .

    And the case presented is so very typical for any system that is not in its ‘out-of-the-box’ state anymore.

    Quite obviously, the author of those hints did not expect hash joins to make it into the “allowed for OLTP” feature list…

    But to give her/him credit: it’s pretty damn hard to shoot straight on moving targets like NetWeaver on <any-DBMS>.

    thanks for that!

