Skip to Content
Author's profile photo Jens Gleichmann

SYB: Update statistics – function datachange bug or feature?

Hi ASE specialists,

once again I’m facing an interesting issue on Sybase / SAP ASE.

May be you know how important it is to have updated statistics in your DB. The newer the statistics, the better for the execution plan (in normal case).

If you have configured your ATM (Automatic Table Maintenance) in dbacockpit correct, you should have good statistics if all is running fine (e.g. job scheduler).

Wrong or right?

SYB_GET_DATACHANGE

Let’s test it with function module SYB_GET_DATACHANGE. Here you can specify the table and get back the data change ratio (%) of the table and table partitions. You also can check this manually with the “datachange” function (select datachange(object_name, partition_name, colname)), but the ABAP part uses exactly the same.

May be it is a little bit more comfortable to execute it in ABAP 😉

So currently we have a data change ratio of 62% but no partitions have changes.

OK however, we will collect statistics. Therefore we use the “update all statistics” command. Why updated all statistics? To avoid missing statistic on partitions, indexes and columns.

description in sybase documentation:

“update all statistics updates all the statistics information for a given table. Adaptive Server keeps statistics about the distribution of pages within a table, and uses these statistics when considering whether or not to use a parallel scan in query processing on partitioned tables, and which index (es) to use in query processing. The optimization of your queries depends on the accuracy of the stored statistics.”


select datachange('SAPSR3./BIC/FZSDHUC03',NULL,NULL)
go
---------------------------
                   62.873411
(1 row affected)
update all statistics "SAPSR3./BIC/FZSDHUC03"
go
sp_flushstats
go
(return status = 0)
select datachange('SAPSR3./BIC/FZSDHUC03',NULL,NULL)
go
---------------------------
                   62.873411

sp_flushstats is used to flush the counters to disk. This will be also done by the housekeeper. It is not necessary to do this here manually.

You should never use datachange without specifing a column, because the result will be aggregated across all columns, but SAP is doing it (SYB_GET_DATACHANGE), so… 😉

Normally the data change ratio should be nearly 0. No changes happened to the table, so is it a bug?

To get more details you can use the tool optdiag.

Here the syntax:


optdiag statistics <SID>..<table> -Usapsa -Ppasswd -X -o output.out

The output showed me that not all columns were updated with the “update all statistics” command, but also after manually updating this columns the result of the datachange output is the same.

So it seems to be a bug. OSS message is created, but currently no answer to it. The used DB version was ASE 15.7 SP122.

##########

Update

##########

The development team released a note within 2 days with a correction on SYB_UPDATE_STATS:

2079837 – SYB: Avoid redundant statistics update due to CR 770415

It seems to be connected with the hashing in some releases, if we have a look into the coding:

“to avoid running into CR 770415, we need to enforce no hashing for partitioned tables on older releases

[…]

if partcnt > 1 and ( dbrel < ‘15.7.0.132‘ or ( dbrel+0(4) = ‘16.0’ and dbrel < ‘16.0.01.00‘ ) ).

[…]

Please implement this note when your AE release is lower than 15.7 SP132 and on ASE 16 lower than SP1.

This will solve this issue!

##########

Update End

##########

I hope could help you to understand the statistics.

If you have any further questions, don’t hestate to comment the blog or contact me or one of my colleagues at Q-Partners ( info_at_qpcm_dot_de )

Best Regards,

Jens Gleichmann

Technology Consultant at Q-Partners (www.qpcm.eu)

Details:

update all statistics

update index statistics

update statistics

update table statistics

optiag

datachange function

Assigned Tags

      3 Comments
      You must be Logged on to comment or reply to a post.
      Author's profile photo Former Member
      Former Member

      Hi Jens,

      The above is a very nice account of your good attempt to get acquainted with SAP ASE Opt Stats. I shall try to share with you my experience, if applicable, on those issues that you have raised. Since the questions are well embedded in your long post, please excuse me if I miss any of them.

      (1) On ATM/job scheduler

      I have had no experience with ATM so I cannot comment here. However, I have seen a recent (Sep 2014) ASE Change Request (CR) of the New Feature Request (NFR) type raised internally by SAP Engineering to create a framework of production quality features around "capture missing statistics" to automate UPDATE STATISICS to correct query performance issues arising from missing opt stats on non-indexed columns.  This is a nice NFR, which is overdue since ASE 15.0 GA release.  Once this is implemented, both SAP & non-SAP customers using SAP ASE would benefit.

      The above CR/NFR would suggest that ATM may not be entirely capable of fully automating updating opt stats based on what are being missed by the Optimizer when generating query plans.

      (2) On "update all statistics"

      This is not the magic bullet and could be an over-kill in attempting to improve on the Optimizer's ability to get more optimal query plans. For databases with huge tables with large numbers of columns & hundreds of millions of rows, it is not practical to run "update all statistics" on them.

      Here are some caveats:-

      A/ It may take too long to run the commands depending on number of columns & rows per table;
      B/ Statistics on columns, which are not involved in any queries' table joins or search predicates etc. are never used in the Optimization of query plans, therefore, do not add any values;
      C/ Even with statistics on table join columns or search predicate columns, there is no guarantee that the Optimizer would pick better query plans (it takes further efforts to 'persuade' the Optimizer in some occasions by increasing number of histogram steps for example).

      In my experience, it is not practical to determine what tables/columns to update stats based on datachange values alone.  It is more effective to enable 'capture missing statistics' at server wide for a very short duration e.g. 15 seconds at peak load period then establish what column histogram and/or column group density statistics are being missed by the Optimizer leading to sub-optimal query plans then target those tables/columns with missing stats to update their stats accordingly. 

      (3) On whether update stats should reset datachange values

      In your example of running "update all statistics", only the table name is supplied - this would update statistics of all the indices & columns of the table and should reset their datachange values to zero. However, I am not sure the datachange values of the partitions are affected as the command does not explicitly name any of the partitions.
       
      "
      Normally the data change ratio should be nearly 0. No changes happened to the table, so is it a bug?
      "

      I presume your above comment refers to an update stats command is expecting to reset datachanges to zero for those indices & columns within the scope of the command.

      (4) On your unfortunate experience as per your comment below, without checking the Case/Message & the details of the repro of the alleged 'bug', I cannot put forward any suggestion however I am aware of tables with huge number of partitions causing some bothers with ASE expected behaviour.

      "

      The output showed me that not all columns were updated with the "update all statistics" command, but also after manually updating this columns the result of the datachange output is the same. So it seems to be a bug. OSS message is created, but currently no answer to it. The used DB version was ASE 15.7 SP122.

      "

      Best regards,
      Raymond

      PS - I have had done some experiments on a table with 3 partitions of Round-Robin and equal distribution of 45,000 rows in each.  In my tests, I updated some rows in 1 of the 3 partitions then used "update statistics <tablename> partition <partitionname>" command after the update.  I saw positive results of:-
      + datachange value being reported as a lower average at the table level;
      + datachange value being reported accurately at column level;
      + datachange value being reported accurately at partition level;
      + after the update stats, datachange value being reset to zero at column level;
      + after the update stats, datachange value being reset to zero at partition level in some of the tests;
      + after the update stats, datachange value being reported as a lower values at partition level in other test.

      The SAP ASE version used in my test is ASE 15.7 SP61.

      Author's profile photo Jens Gleichmann
      Jens Gleichmann
      Blog Post Author

      Hi Raymond,

      thanks for your comment and your critism, but I think you didn't get the point of this blog! This was no critism to sybase itself! It was more the coding of the ABAP report which delivered wrong values. More the logic how the datachange function was used was wrong.

      (2) On "update all statistics"

      This is not the magic bullet and could be an over-kill in attempting to improve on the Optimizer's ability to get more optimal query plans. For databases with huge tables with large numbers of columns & hundreds of millions of rows, it is not practical to run "update all statistics" on them.

      => I have tested it not only with update all statistics and you are right, this way should be the last way to update the table stats. But I have tested it also with update index / update table / partition statistics but the result was the same! The update all statistics should just show that also this way solved not the result of the datachange function!

      3) On whether update stats should reset datachange values

      In your example of running "update all statistics", only the table name is supplied - this would update statistics of all the indices & columns of the table and should reset their datachange values to zero. However, I am not sure the datachange values of the partitions are affected as the command does not explicitly name any of the partitions.
       
      "
      Normally the data change ratio should be nearly 0. No changes happened to the table, so is it a bug?
      "

      I presume your above comment refers to an update stats command is expecting to reset datachanges to zero for those indices & columns within the scope of the command.

      => as already mentioned I have tried not only the update all statistics and the optdiag command showed me that all statistics are up-to-date. So why should the datachange function show a value higher than 50%? The screenshot is showing us that all partitions have change ratio of 0. Have you an exlanation for this?

      The sybase documentation of the update al statistics command includes this:

      "update all statistics updates all the statistics information for a given table."

      all the statistic information of a table which includes in my interpretation also the partitions.

      (4) On your unfortunate experience as per your comment below, without checking the Case/Message & the details of the repro of the alleged 'bug', I cannot put forward any suggestion however I am aware of tables with huge number of partitions causing some bothers with ASE expected behaviour.

      => this statement shows me that you are working for SAP/Sybase and you are completly right without checking all of the details you shouldn't try to give rate on this. So don't call it 'unfortunate experience', because the fact that SAP released a fix on the coding, showed that I wasn't so wrong with my transliteration as alleged bug, or?

      PS - I have had done some experiments on a table with 3 partitions of Round-Robin and equal distribution of 45,000 rows in each.  In my tests, I updated some rows in 1 of the 3 partitions then used "update statistics <tablename> partition <partitionname>" command after the update.

      => regarding your tests: Did you include hashing in your partitioned table? Your version should also include this bug as mentioned in the update of the blog.

      So common and accept that it was a bug regarding the ABAP coding 😉

      Bugs are normal and the development provided this fix pretty fast, so don't worry and take it sporty. 😛

      Regards,

      Jens

      Author's profile photo Former Member
      Former Member

      Hi Jens,

      I did not mean to criticise by replying to your post.  True, I did not understand your long post was a really a long build-up to point out that there is a bug in SAP ASE.  I was paying more attention to your "questions" as I thought you were genuinely looking for answers. My bad 😉 .

      I worked for Sybase as a Product Support Engineer, then, I moved to Sybase Education before becoming a SAP consultant. My specialist skills are in SAP ASE. When working at customer sites as a SAP consultant, I would ask customers not to use "update all statistics" as it is not a panacea (as I explained in my reply to your post) but focus on updating Optimiser stats for those tables/columns, which are involved in queries with sub-optimal query plans.

      Regarding the SAP Notes & more specifically, the SAP ASE CR behind it, the product defect being fixed involved the followings:-

      (1) update index statistics occasionally skips over processing minor index column;

      (2) as a result, datachange percentage at table level continues to show non-zero;

      (3) for those partitions, which do not involve the minor index column being missed, their datachange values are shown as zero;

      (4) for those partitions, which involve the minor index column being missed, their datachange values are shown as non-zero.

      As a minor note, running "update all statistics" command with the "partition" keyword selectively update local index statistics for the named partition. So partition statistics are really the statistics of their local indices.

      No, I did not include hash partitions in my tests.  FYI, semantic partitions and round-robin partitions are treated identically as far as Optimiser statistics is concerned.

      I shall test out the CR fix available in SAP ASE 15.7 SP132 & ASE 16.0 SPx in due course. Personally, I would like to see this CR fix being back-ported to ASE 15.7 SP0xx & SP1xxx releases as there are customers running the earlier releases.

      Best regards,

      Raymond