Skip to Content

The prior chapter in this blog series ‘As The Ticket Turns’ generated a plethora of comments, more than any of my recent blogs.  At the risk of interpreting others meaning,  I would break them into 2 threads:

1) Ticket processing
2) Database tuning

I’ll add 2 more topics to address here:

3) BW technical content
4) Data prevention

We’ve made progress on resolving this issue since my first post, but as I write this we’re not there yet.  I would say that the suggestions are getting better, a typical trend as a complex problem is kicked around until it finally reaches the party that has an idea what is wrong.

 

 (1) Ticket processing


I expressed my opinion in my Solution Manager webcast that the tool allows SAP help personnel to give us the wrong answer a lot faster.

There’s an emphasis on ticket completion time, which is understandable as customers complain about unaddressed issues as much as wrongly addressed ones.

This ticket started out in queue “SV-BO-BI”.  That seems to be somewhere between purgatory and limbo.

One suggestion to fix the problem was that we had database space management fouled up:

‘The settings for all the above tablespaces is “AutoExtend: Off”.’

Um, it’s slow.  It’s not filled up.  And by the way, we have sophisticated space management tools that we built ourselves (Power Tools, you know?) that work with our enterprise storage system.  

Lately, as we zoom closer to the root case, the ticket moved to the “BW-BCT-TCT” and “BW-BCT-TCT-BAC” queues.  The analysts looked a little closer at the class / function module that is sucking up all the time, namely “CL_RSTCT_BIRS_OLAP_AGGR_EX.”  I could have told you that.  Oh, wait, I did.

Today’s suggestion (“SAP proposed solution”) is to apply a note.

Alas, the note is inaccessible:

The requested SAP Note is either in reworking or is released internally only

Others have complimented or criticized SAP for the turnaround time and quality of recent message solving.  I will no doubt paste links to these blogs into the “positive call closure” survey, assuming that the person who opened the ticket doesn’t close it ahead of time.

 

 (2) Database tuning

Before we look deeper into the statement I flagged previously, I want to talk about “The New New New DBA Cockpit.”  With BW 7 comes the latest incarnation of screens to get at data we experienced DBAs go to our SQL script libraries for.  But at times, the ability to quickly spit out a table of history records is important, I’ll use DB02.

As a sidelight, I found and we needed to fix a post upgrade bug with DB02.

Error: 

ABAP program “SAPLS_ORA_COCKPIT_5” had to be terminated because it has come across a statement that unfortunately cannot be executed

Fix:

SAP Note 1147562 – DBA Cockpit -> Segments detailed analysis – DYNPRO_NOT_FOUND

We did not open a ticket – I got help on Twitter, and we found the SAP note fairly quickly with a simple search.

The screens below may illustrate what would drive a DBA up the wall when trying to locate basic information.

Screen 1

 

image image
image image

 

Screen 2

 

image image
image image

 

 

To fit these into the SCN blog image limits, I split my screen shot into 4 parts.  I normally work with 1024×768 resolution.  While my desktop monitor allows higher resolution, the laptop screen is limited; I leave it smaller to keep my icons and other desktop grimbles in consistent places.  It bugs me that software developers assume and insist that we all have huge screens.  I can’t replace this PC until its lifecycle is through.  End of story.

What I hope you observe on these images is the futility in seeing segment history in a meaningful way.  You probably can’t see all of the slide bars, since some of them are hidden and can’t be seen until you slide another bar and poof they appear. Total insanity of user interface quality!

The top screen shows current data about the segment; the lower shows history – 3 weeks of it.  To make this a lot clearer, I dragged the data into a spreadsheet and graphed it (there’s probably a graph button there but in all the years I’ve used SAP I’ve almost never found one of those helpful).

RSDDSTATEVDATA – Segment History By Day

image

 

 You can see that after our upgrade this object took off like a rocket, pausing slightly only on weekends until 15-Aug, when we began purging data.  This shows allocated, not used space, so the full picture of row falls and rises is not obvious.  Short version: the SQL statement and what the optimizer does has not changed.

 

Enough of DB02, let’s talk about ST05, ST04 and the other tuning transactions like SM51.  I’m going to skip over a “Oracle SQL cache analysis” recap here, and jump right to how I tune. 

I start by looking at SM66 on a frequent basis, to see what “long running transactions” are normally there.  If I hit a slow screen, I look at SM66 to find out what code is running.  In this case, I found that an RFC transaction is launched via another user (BW-ALE) to go get data.  Normal SQL traces showed nothing, as my user ID wasn’t being charged with the time, nor was the workload even on the same application server where I started.  We’ve seen a big jump in RFC calls from BW 3.5 to 7.

After I find the problem with SM66, I go to SM51, via SM50 that lets me go to the right application server.  The advantage SM51 has is that it shows the full table names, which SM66 and SM50 do not.  This is critical in modern systems!

SM51 showed me the ABAP program that was running, and how much CPU it had accumulated. 

I also looked at ST04 to find the active Oracle sessions.  That’s generally faster than going all the way to a SQL trace or digging through the cache.

The disadvantage that SAP help, and probably many of my blog readers have, is that I know what is right in our systems, and you don’t.

– “The width of those tables are very small” 

  Whatever gave you that idea?  This is a 3 table join.  Even if Oracle doesn’t use much space for each row, the application servers runtime object is a lot bigger.

– “SQL plan between when you run in SQLPLUS vs in st03n? ST04’s sql plan”

  Again. it’s not the plan that’s the problem.  Here’s why:

 

SELECT * FROM “RSDDSTAT_OLAP” WHERE “STARTTIME” > :A0
ORDER BY “SESSIONUID” , “STEPCNT” , “HANDLETP” , “HANDLEID”

 

We’re playing “Go Fish” and ABAP has told us to give it every card greater than a 2, or 3, or whatever.  Oracle decides “I will look through all the cards, as that’s faster than using an index.”  Hence, 2 full table scans.

As Bala said, that doesn’t take very long, really, even with 1 GB of data.  Where is the time being spent?

Uh, in the application code?

Let my DBAs go.

 

 (3) BW Technical Content

 

As I’ve said, this problem started after the BW7 upgrade.  So, it’s code or data related.  We didn’t change Oracle parameters or patch sets (well, maybe a couple minor ones – but they don’t matter – see above).

I used ST03 transaction workload analysis to take a peek at what the other analysts were doing one day.  I didn’t need to run the code a bunch of times to see it had issues, but someone did.

App server 1

System Action in Triggering System Number of Dialog Steps Total Response Time (s) Average response Time/Dialog Step (ms) Total CPU Time (s) Average CPU Time (ms) Total Database Time (s) Ø DB Time (ms)
 BW  ST03        2  7727  3863451.0  7417         3708665.0 297  148256.0
 BW  ST03N      1  7557  7556892.0  7292  7291890.0 241      241445.0

 

App server 2

System Action in Triggering System Number of Dialog Steps Total Response Time (s) Average response Time/Dialog Step (ms) Total CPU Time (s) Average CPU Time (ms) Total Database Time (s) Ø DB Time (ms)
 BW  ST03N      1  7891 7890650.0 4155   4155370.0  212  212261.0

 

App server 3

 

 

System Action in Triggering System Number of Dialog Steps Total Response Time (s) Average response Time/Dialog Step (ms) Total CPU Time (s) Average CPU Time (ms) Total Database Time (s) Ø DB Time (ms)
 BW  ST03        3  15274     5091222.7 12112 4037453.3 434  144650.7

Yes, we’re spending a few minutes in the database.   But many more on the application server, burning CPU cycles.

 

 (4) Data Prevention

 

Here, I’d like to plead that you visit the ASUG ILM team (Information Lifecycle Management), or the equivalent space on SCN, maybe service.sap.com/ilm and look for the Data Management Guide.  Prevent these data from accumulating, or purge them when they’re no longer needed.  We’ll all be happier.

 

Vers.  Date
 
 5.7  April 08, 2008
 5.8  Aug. 4, 2008
 5.9  Nov. 5, 2008
 6.0  June 30th, 2009

Data Management Guide (Version 6.0)

© SAP AG, 2009 Page 28

[I’m quoting from the book to illustrate an academic point, which should be fair use…]

Excerpt:

4.1.11 RSDDSTATAGGRDEF (Statistics Data OLAP: Navigation Step / Aggregate
Definition)

4.1.11.1 Prevention
Cannot be used. [Boo!]

4.1.11.2 Aggregation
Cannot be used.

4.1.11.3 Deletion
After implementing the correction from SAP Note 847388 (as of SAP BW 3.0), you can use program SDDK_STA_DEL_DATA to delete these entries. In addition, during the deletion process the system repeatedly performs a database commit, so that less memory is used.

4.1.11.4 Archiving
Cannot be used. [Just delete it]

4.1.12 RSMON* […]

To report this post you need to login first.

2 Comments

You must be Logged on to comment or reply to a post.

  1. Bala Prabahar
    Jim,

    Thanks for this blog.

    I guess I follow a different approach to solve issues. The approach I normally take when we run into issues is that “I am not OK, everyone else is ok” – I modified the original phrase from the book “I’m OK, You’re OK, by Thomas A Harris”. What does this mean? Anytime we run into any issues, I blame myself(or my team) first and then troubleshoot to prove that I am wrong!

    I normally run more than once anytime we have issues(of course this depends on the compexity). SAP even recommends running more than once for SE30 analysis. First time, according to SAP developer(Courtesy: TechEd 2006, hands-on session), you run SE30 with Full Aggregation for all the statements, second time, identify the bottleneck and collect detailed stats(Aggregation none or Per Call Position) for the component with the issues. Additional runs may be required based on how complex the problem is.

    Yes, you did tell us: CL_RSTCT_BIRS_OLAP_AGGR_EX in your first blog. (The screenshot). This was  the reason why I suggested SE30 analysis. This class still doesn’t tell us which statement is having issues. SE30 would give the runtime at a low level(statement within that class).

    Yes, the view is a 3 table join. Don’t know why it matters.  This was irrelevant because the sizes of those tables were insignificant(not a big deal,as you stated).
    Not sure why the runtime object size(why it would be much larger than DB object is another question) matters because we are discussing how much Oracle needs to read to satisfy the request.  
    I rely more on ST05 for DB issues because SM51 output is bit misleading(or sm50, you can get table name even in sm50 by double clicking the work process. I believe what you see from sm51 is really sm50! because SM51 shows the list of SAP servers, Sm50 is the process overview). As we noticed, SM51(or SM50) was displaying the last statement. I notice this symptom in ST04-> active sessions as well.

    Thank you,
    Bala Prabahar

    (0) 
    1. Jim Spath Post author
      Bala:

        Interesting perspective.  If that works for you, great.  I prefer to find the culprit via the most direct route.  In this case, once I eliminated the SQL statement and the Oracle optimizer, it was either data or code.

        While you say the data size was insignificant, the tables grew from 0 to millions of rows shortly after our BW upgrade.  It was certainly feasible that data volumes were causing an issue.

        As to SE30, I have sometimes traced SAP code and showed them what is wrong with it, but I believe that SAP support should do this, not us.  This is their code.  When the logic generates SQL statements forcing the optimizer to do full table scans, that’s usually a clue that it was released with inadequate volume testing.

        Compare any database object against its runtime object.  Oracle uses UTF-8 to store data; strings are packed with VARCHAR2.  SAP application servers expand the data to UTF-16, and strings are full length.  The main point here is that the majority of time is spent on the application server.  Allocating memory for junk is a waste of time; bringing in data and not using it likewise.

        Oracle reading the data is one factor; shipping it to the app server is another, and what happens there yet another.

        Anyway, we applied the SAP note fix in our quality system, and guess what – the SQL statement generated is different, and the CPU time has magically shrunk.

        I may not blog the next phase here on SCN until the #blogtheft episode is over, and definitely not until the fix reaches production. 

      Jim

      (0) 

Leave a Reply