Skip to Content

Browsing through closed support messages while looking for a possible approach to a certain Oracle CBO problem I came across a quite problematic issue.

Some support colleagues sometimes seem to focus too much on removing the symptoms of a software defect rather than analysing it correctly and forwarding it to the development team so that they create a fix for it.
Of course there is reason behind this kind of message handling: the way the performance of support consultants is evaluated motivates them to focus on message closings.

Unfortunately a closed support message does not equal a solved problem, as we all know.
In the second best case (say, solving the real problem is the best possible message outcome), the support consultant will be able to present a fix for the issue.
Something that makes the effects of the problem disappear or make them less critical.

For quite a number of Oracle performance problems, where the CBO decided for a wrong execution plan, such a fix would be to manipulate the optimizer statistics.
SAP note #724545 explains how this can technically be achieved and it’s pretty simple once you’ve done it a few times.

Provide a fix – not that easy…

However, the critical point here is not to know how to perform this manipulation but to know when to use this tool and to know which of the optimizer statistic values needs exactly what tweak to lead to the desired result.

One pretty common kind of manipulation is to change the number of distinct values for a certain table column or to reduce/increase the clustering factor of an index.

To be able to figure out what exact value is causing the optimizer to choose the wrong plan is on the other hand pretty difficult.
Of course there is some documentation available about the calculations the optimizer performs (roughly) and one may also employ the quite readable optimizer trace (event 10053).
Still, to understand the reasons for the optimizer decisions keeps being complex, especially once it is about a multi-table-view-join construct.

Provide a fix that is actually usable – pretty difficult

Now we’ve this toolset to change the optimizers input (the statistics) and we’ve a not so good understanding of how the optimizer works with this input.
Of course the support consultant will likely try a out a bit until the execution plan for the statement in question looks fine.

The message is then usually returned to the customer with some notes on what was done and that now it should be checked whether the response time of the report has improved.

The big mistake.

Let’s assume that the modifications actually did what they were supposed to.
The next thing that happens is: the customer confirms the message.
What’s wrong with this?

The problem is: now a symptom of a specific complex problem, maybe a product error (a bug) in Oracle, has been removed. Note: just the symptom.
The error is still in the system and nobody is looking after it to think about a solution.

Even worse, statistics manipulation are very difficult to maintain in the operations of multiple systems or even service providers.

Until lately there wasn’t a standard approach to keep and document these statistic manipulation.
Recently the DBSTATC table usage has been ‘enhanced’ to handle such settings as well (check note #892296 for details on that).

Anyhow, for the user it will always bring much additional effort to keep manipulated statistics.
The changes need to be documented, eventually re-applied and re-evaluated with the next patches that get installed in the system. And all the time unexpected side-effects could occur (e.g. other statements that used to work OK now run too slow).

For the support it means that a problem that already happened to occur some times wasn’t analysed correctly yet. That means more work and less time to focus on new issues.

Don’t go for quick fixes

Therefore, as a customer you shouldn’t confirm the problem just because the symptom is gone.
Ask for a solution instead.

As a support consultant you should remember that a fix just isn’t a solution and shouldn’t be presented as such.

If there is a good reason for the statistics manipulation (e.g. extremely volatile data volumes so that it’s hard to gather representative statistics), then SAP should and will provide a note with these manipulations.

If the reason for the manipulation is not so clear, then it’s very likely that you’re facing a bug here that needs to be identified and solved.
In many cases it may be sufficient to increase the sample size for the statistic gathering or to install the latest Oracle patches.

Anyway, it has to be clear which bug you hit and how the fix works and of course how the bug will eventually be really fixed.

For those interested in the technical side of these topics, you may want to check the following notes:

892296 Enhancements in update statistics in BRCONNECT 7.00/7.10
106047 DB21: Customizing the DBSTATC
724545 Adjusting the CBO statistics manually using DBMS_STATS
1020260 Delivering Oracle statistics
176754 Problems with CBO and RBO
766349 FAQ: Oracle SQL optimization
588668 FAQ: Database statistics
756335 Statistics in tables w/ heavily fluctuating volumes of data
122718 CBO: Tables with special treatment

Book and blog/website of Jonathan Lewis: http://jonathanlewis.wordpress.com/

To report this post you need to login first.

7 Comments

You must be Logged on to comment or reply to a post.

  1. Matthew Billingham
    I occasionally do support work, so I agree with you whole-heartedly.  One of the problems I encountered with my development team was programming by coincidence.  For example, a year variable was unexpectedly 1 less than it “should” have been.  The programmer fixed this, by adding one to the variable when the offending method was run.  Now the program works.  Or rather it does what it’s expected.

    But he hasn’t solved it.  He’s just fixed it.  He doesn’t know WHY the variable had a lower value than expect.

    A week later, another problem comes in for the same application.  In another part of the program, the year variable is one year later than expected.  So, the programmer subtracts 1 from the year variable.

    A week later, the original problem is back!  The fix was adding/subtracting 1.  The solution was to create two variables – one for “this” year and one for “last” year.

    (0) 
    1. Lars Breddemann Post author
      Hi Matthew,
      thanks for sharing this.
      I’m pretty sure that most developers/supporters will know some stories like this one.

      For SAP systems however, I think this should never happen. They are just to important to most customers business to allow such quick-and-dirty fixes and leave the original problems untouched.

      As this is just about techical problems, we can be sure that those bugs are in fact solveable. There’s no need to accept a workaround instead.

      regards,
      Lars

      (0) 
  2. Bert Ernie
    I like your blogs, I remember the ‘How to “win” in SAP customer messages’.

    I fully agree to find a real solution for such problems, but an easy workaround is often key for everybody, it just takes the pressure off.

    Besides that thanks for your insights in how the SAP support actually works. In my opinion this is very valuable knowledge to minimize frustration and speed up problem solving. Keep it up!

    Cheers Michael

    (0) 
    1. Lars Breddemann Post author
      Hi Michael,

      glad you like my Blogs – I guess the range of topics is rather ‘special’, thus the audience is usually quite limited 🙂

      One thing to stress here is: OF COURSE the support processes are not MEANT to lead to the described problems.
      Nobody plans to leave bugs that customers hit unfixed.
      However, I think that it’s basically not possible to fix this by changing processes. It’s necessary to change attitudes for that – both the attitude of the supporter and the attitude of the customer.

      regards,
      Lars

      (0) 
    2. Matthew Billingham
      That’s the whole point. An easy workaround takes the pressure off, sure.  But in the LONG RUN, it costs more, because it doesn’t address the underlying issue. 

      When you’ve found the workaround, don’t stop – get the solution!

      As a developer I get the same issue in project implementations.  “If it works, do it”.  But what happens then, is that you’re shoving project costs into support costs.  Great for everyone on the project but expensive for the company as a whole.

      Fixing bugs during testing is much much cheaper than fixing them when everything has gone live.

      If only we could get away from the short-term mindset.  But with yearly bonuses, I guess it’s unlikely to happen.

      (0) 
  3. Rui Pedro Dantas
    Hi. Nice blog!

    I have one doubt: if you do force column statistics (e.g. DISTCNT), is there any way to configure that statistics for the table should be calculated, but not for this column in particular?

    I understand that the goal of your blog was quite the opposite (to avoid these kind of fixes), but if this configuration was possible many of the disadvantages would disappear (they are often related with statistics for other fields getting obsolete).

    (0) 
    1. Lars Breddemann Post author
      Hi Rui,

      on Oracle level alone – no there’s just the all or nothing approach.
      For SAP on Oracle there is a solution for that presented in note  #1374807.

      Basically brconnect had been enabled to gather current stats for tables with fixe statistics and re-apply the statistic modification afterwards.

      Anyhow the core point of my blog is not that such statistics manipulation is evil in any case.
      But using it for a final solution is a really bad decision.

      regards,
      Lars

      (0) 

Leave a Reply