Operator? Support? Hello? No, I don't want to ta...

JimSpath · ‎08-14-2009

The Wind Up

After our BW 7 upgrade, we looked pretty good once a backlog of process chains cleared out, and the tourist trade moved on from "hey look at the shiny new warehouse". I was able to view my stock performance reports, and other than new wrinkles with the Java stack, everything looked rosy. Except that within a few days, I could no longer view process chains in ST03/Workload Analysis. I dug deeper.

Development and Quality had been okay, as far as drilling into process chains as I have done for a few years. There was one glitch with the DBA cockpit that caused an ABAP dump when looking at specific menu tabs in storage history, but that was cleared up with help from Twitter and service.sap.com. The problem with ST03 was that I was getting an hourglass, a timeout, a hang. I impatiently killed the transaction and went to my colleagues.

Several reported "no problem" with ST03 initially, which led me to look at desktop settings, GUI, etc. Then, normal problem isolation techniques found that the problem happened for my user id on other systems. It was me, or something I did that others didn't. Finally a BW architect hit the same snag, trying to look at process chains.

The Play By Play

Once the problem was isolated to transaction steps and reproducible, I began looking for the long running code. Our initial SQL trace attempts were inconclusive, as the trace showed nothing running on my account, yet I continued to have long runtimes. I looked at SM66 and found processes that matched the approximate start time of my report, but they were on a different application server, and not under my ID. My data request had been transformed into an RFC call. Outsourced, if you will.

Time to open a ticket. And wait. I knew this was going to be a tricky one as it had taken us a little while to find it. I'll fast forward over the first several non-responses to get to the gems.

Your database parameters are incorrect.

Well, fine. I bet you read that in a book. Called "Early Watch Report". Support has access to reports run by analysts who have been working for SAP less time than we've been running SAP. Our DBAs came back with point-by-point explanations of why we have the settings that we have. As we've been trained by Oracle Center of Excellence world-class DBAs, we're confident that isn't the issue here.

In fact, our lead DBA looked at the statement originally and said even though there are indexes on the tables in the view, there is too much data.

Your buffer parameters are incorrect

Second try. Another "cook book" answer to a problem. An irrelevant solution to a specific problem. Table buffer sizes have nothing to do with this code. There are millions of rows in the RSDDSTAT tables now, so buffering them would be a no-win.

The Double Header

By now, I'm getting to be impatient as support is waltzing around different parts of the dance floor, getting nowhere near a fix to the problem. In the meantime, I'm working an entirely different technology issue with a different vendor, a problem where the time in Mexico City is different than in the U.S. Users there say the time is the same as our Central time zone, but their daylight savings times rules are different. I believe the users, of course, and find evidence online in one of the timezone sites I visit (the other is ambivalent unfortunately). However, the code in the OS and in the scheduling application doesn't work.

The timezone call took more than 40 minutes, where I was forced to listen to technicians doing web searches, and read fixes out loud that we had viewed and discarded immediately as inapplicable weeks before. When that call ended with a "we'll have to get back to you" I wasn't in the best of moods. Then, the latest SAP ticket update with "unable to reproduce" led me to volunteer to report my findings in the case log. Here we are with Mr. Vidal (pronounced "Veedle" - by Lily Tomlin), or "Operator" (shtick from the 1960s by Bob Newhart). Hello? Yes, we have a problem on the line.

The Box Scores

Here's the code I saw in trying to reproduce the issue for support to look at.

SQL Plan [15:37]

Parse Timestamp: 20090813 15:37:30

System: BW
SQL_ID  8fzky9kv03m4c, child number 0
-------------------------------------
SELECT * FROM "RSDDSTAT_OLAP" WHERE "STARTTIME" > :A0 ORDER BY "SESSIONUID" , "STEPCNT" ,
"HANDLETP" , "HANDLEID"

Plan hash value: 3919501385

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name             | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                  |       |       |       | 54732 (100)|          |
|   1 |  SORT ORDER BY                 |                  |   677K|   139M|   311M| 54732   (2)| 00:10:35 |
|*  2 |   HASH JOIN                    |                  |   677K|   139M|   112M| 28079   (3)| 00:05:26 |
|*  3 |    HASH JOIN                   |                  |   677K|   104M|       | 10103   (3)| 00:01:58 |
|   4 |     TABLE ACCESS BY INDEX ROWID| RSDDSTATINFO     | 14760 |  1109K|       |   515   (1)| 00:00:06 |
|*  5 |      INDEX RANGE SCAN          | RSDDSTATINFO~010 |  2657 |       |       |    30   (0)| 00:00:01 |
|   6 |     TABLE ACCESS FULL          | RSDDSTATHEADER   |  5120K|   415M|       |  9542   (2)| 00:01:51 |
|   7 |    TABLE ACCESS FULL           | RSDDSTATEVDATA   |  7633K|   393M|       |  9425   (3)| 00:01:50 |
-----------------------------------------------------------------------------------------------------------

Sharp DBA types will notice that RSDDSTAT_OLAP is a 3 table join view, and that 2 of the tables are huge, and that Oracle says "do a full table scan" on both of them. Ugly code. But, there it is.

A few minutes after I grabbed this SQL statement (and pasted it into the ticket) I viewed SM51, to show the running dialog process I was waiting for.

Picture 1 [15:53]

Picture 2 [16:00] - more than 20 minutes after I grabbed the SQL plan.

TO BE CONTINUED!

Operator? Support? Hello? No, I don't want to talk to Mr. Veedle.

Are you there, SAP? It's me, Jelena

Integration Point of MM-FI-SD in SAP ERP

SAP Project System - A ready Reference ( Part 1 )