Skip to Content

We are in the midst of a lengthy project of migrating a large number of ASEs to 16.0.1.2 (stopped on 16.0.1.2 since we bumped into a strange bug with misbehaving unions with sort by column number which no one could reproduce/explain).

Migrating one of our ASEs we’ve got an input from the field that a conversion proc runs much slower on the new server than on the old server.  Well, with the regular 80/20 approach  it sounded like a thing to be expected.  Not everything converts smoothly – some DBA activity is still required for the final finishes.

I’ve started to investigate – actually was extremely lucky to have both versions of ASEs and the possibility to rerun the code myself.  DBA’s paradise.  To be extra precise I decided not to stop on the regular showplan, plancost, time/io statistics but to delve into full query metrics and procedure memory footprint on each ASE.  Pretty fast I’ve got stuck:  based on all metrics ASE 16 executes the procedure faster than ASE 15.  sysquerymetrics for all statements has better metrics for ASE 16.  Plans generated by optimizer are either completely identical or better based on estimated IO.  Field execution time – ASE 16 runs twice slower.  Hm.

Lovely stuff.  In addition I’ve noticed another pretty nasty surprise:  tracing MemUsageKB in monCachedProcedures for the proc while it is executing I’ve observed that for ASE 15 the footprint rests somewhere below 1 MB, while for ASE 16 it grows continuously – up and beyond the size on the procedure cache.  If you are lucky – the proc execution eventually ends freeing up the memory back into PC pool.  If you are not (e.g. if you start “debugging” the proc execution adding time to write out various statistics) you eventually get the notorious 701 error and the proc memory is freed.

I do recollect from memory that sometime the same optimizer generates plans weighed differently for the same proc.  I faced this unexplained situation in ASE 15.7:  force optimizer to regenerate the plan (sp_recompile….) and the thing passes.  But I did not recollect having a proc MemUsageKB growing constantly while the proc is active (and indeed even on the same ASE 16 server the same proc sometimes behaves normally – generates a plan, uses is with the MemUsageKB staying fixed on around the same value as in ASE 15).

The proc has a cursor that moves 150K rows from one database to another for 3 large tables.  Nothing too complicated.  Tested with insensitive / semi-sensitive cursors – same behaviour.  Tested with cursor declared on select directly with 100 fields and cursor on temp table with 1 field – same thing.  I’m really puzzled where my PC memory is getting wasted.

Tried to dig into MDAs:  It’s possible, in theory, to see how much memory proc is using through monSysStatement:  grab the proc PlanID, trace the proc statistics per-line, per-statement.  You should be able to see which proc lines are heavy – on cpu/io/memory level (btw, there is a KBA & CR to fix wrong LIO reporting for insensitive cursors for 16.0.1.2 et al – but this is probably irrelevant to my case).  For my cursor I may see that each loop consumes around 600KB memory.  At the same time, monCachedProcedures report an increase of around 60K after each loop (and throughout the loop each statement has various memory footprint by itself).  Can it be that instead of releasing memory it keeps a portion of it for as long as the plan is active?

Based on monProcedureCacheMemoryUsage the Allocation which constantly grows alongside monCacheProcedures/monProcedureCache is MEMC_SCOMPILE_1 (has constant increase in Active and occasional Reuses).  Same with monProcedureCacheModuleUsage – ProceduralObjects.Active keeps growing, with occasional NumPagesReuse.  Active pages are approximately the same as the MemUsageKB reported by monCachedProcedures.

Anyone has any idea what may be causing this leak?  Anyone experienced the same behaviour?

I’m quite concerned about this.  May jeopardise the upgrade project altogether.

Insights will be appreciated.

Cheers,

Andrew

To report this post you need to login first.

18 Comments

You must be Logged on to comment or reply to a post.

  1. Mike Willett

    Yep

    We’ve had many issues with Procedure cache and high MEMC_SCOMPILE_1

    master..monProcedureCacheModuleUsage doesn’t match master..monProcedureCacheMemoryUsage

    [Simon Ogden has great advice on this]

    In the end we worked around the issue by

    – only allocating consistent size memory blocks for the procedure cache to remove fragmentatioon (not sure why allocating different size block from an area of memory would be a good idea) – trace flag 753

    – allocating a huge amount of memory to procedure cache (on moving to 15.7 SP1xx – we tripled our procedure cache) and increasing engine local cache.

    – reboot sybase server every week or dbcc freeproccache

    Also see

    Get Rid of the Crutches – Right Size Proc Cache

    We

    (0) 
    1. Andrew Melkonyan Post author

      Thanks Mike,

      753 is already in place.  Proc cache is pretty large (6 GB).  In this case cleaning up proc cache would probably not help as it is kept by an active proc and freed up as soon as the proc ends executing. 

      What I found puzzling is that after stripping all the code out of the proc – any select done inside a cursor’s space adds up to the memusage total for the proc in proc cache.  Which is to say, each iteration of the select executed within a cursor eats up an additional chunk of proc cache.  The larger the cursor’s result set and more statements with the cursor’s space, the more memory is consumed (which may server as an indirect reply to the “right size proc cache” idea). 

      I will continue to investigate this.  Sound pretty bizarre to me.

      (0) 
      1. Mike Willett

        > Proc cache is pretty large (6 GB).


        Is 6Gb large ?

        We used to have 5Gb of procedure cache, but since moving to 15.7 we now have 16Gb of procedure cache.


        Interesting about the use of cursors – I’ll try and see if this affects. We felt it was across all procs

        that were run heavily in parallel but maybe I need to do more analysis – although I’ve spent too much time on this already.

        (0) 
  2. Andrew Melkonyan Post author

    Got a simple reproduction for you:

    create a procedure:

    drop proc prc_memusage_check

    go

    create proc prc_memusage_check

    as

    begin

      declare @id int, @cnt int, @rcnt int, @dt datetime, @mu int, @es int, @plans smallint

      select @dt = getdate()

      declare big_cr cursor for

      select id from sysprocedures

      open big_cr

      fetch big_cr into @id

      select @rcnt = 0

      while @@sqlstatus = 0

      begin

        select @rcnt = @rcnt + 1

        select @cnt = count(*) from sysobjects where id = @id

        select @cnt = count(*) from syscomments where id = @id

        select @cnt = count(*) from syscolumns where id = @id

        select @cnt = count(*) from systabstats where id = @id

        select @cnt = count(*) from sysindexes where id = @id

        fetch big_cr into @id

      end

      close big_cr

      deallocate cursor big_cr

      select @mu = sum(MemUsageKB), @plans = count(*) from master..monCachedProcedures where ObjectName = ‘prc_memusage_check’

      select @es = datediff(ss,@dt,getdate())

      print ‘ASE %1!:  MemUsage = %2!, NPlans = %3!, ExecTime(s) = %4!, Rows Processed = %5!’, @@version_number, @mu, @plans, @es, @rcnt

      return 0

    end

    go

    Run it on ASE 16.0.1.2 (or other?)

    ASE 16000:  MemUsage = 36640, NPlans = 1, ExecTime(s) = 24, Rows Processed = 22736

    Run it on ASE 15.0.3 (or other?)

    ASE 15030:  MemUsage = 76, NPlans = 1, ExecTime(s) = 4, Rows Processed = 26690

    You may see plainly that memory consumed is way higher + execution time is way higher.  Pretty stupid code.

    Something in ASE 16 does not process the procs well.

    Those of you that have access to 16 (15.7?) and earlier ASE versions – could you test in your environment and post replies?

    Thanks

    ps.  Look at the difference with large cursors:  that’s completely insane!

    ASE 16000:  MemUsage = 2137968, NPlans = 1, ExecTime(s) = 3000, Rows Processed = 1330791

    ASE 15030:  MemUsage = 76, NPlans = 1, ExecTime(s) = 121, Rows Processed = 1087923

    I’m keen to know where this comes from!

    pps.  same behaviour on ASE 16.0.1.3 & ASE 16.0.2.2…

    (0) 
    1. Mike Willett

      Here’s my results (run twice just in case)

      Sybase 16 is running on a slower machine

      15.7/EBF 24647 SMP SP134

      ASE 15700:  MemUsage = 164, NPlans = 1, ExecTime(s) = 16, Rows Processed = 135188

      ASE 15700:  MemUsage = 164, NPlans = 1, ExecTime(s) = 15, Rows Processed = 135188

      16.0 SP02 PL03 HF1/EBF 26230 SMP ONEOFF

      ASE 16000:  MemUsage = 198, NPlans = 1, ExecTime(s) = 184, Rows Processed = 124729

      ASE 16000:  MemUsage = 198, NPlans = 1, ExecTime(s) = 182, Rows Processed = 124729

      Don’t get the high memory usage you get.

      Although we not aiming to go live with this verison – we’re waiting for the next version to come out with more bug fixes.

      (0) 
      1. Andrew Melkonyan Post author

        Thanks Mike.  Is it Solaris ASE?  My ASEs run on Solaris 10 (still).

        Curious, however, in your case ASE 16 is 10 times slower – although with less rows getting processed. 

        I’ll test tomorrow on Windows ASE + 16.0.2.3 on Solaris.

        Thanks for the effort of testing.

        (0) 
        1. Mike Willett

          All on Linux

          Our Sybase 15.7 is on fast hardware – we don’t normally see 10x slower

          We usually see it 3x slower.

          I hope its not Sybase 16 which is slower – but we’d have to test it.

          (0) 
    2. Kevin Sherlock

      I’m on Ubuntu Linux x86_64.  I don’t see the issue as you state it other than a modest increase in memory for 16.x:

      ASE Adaptive Server Enterprise/15.0.3/EBF 19955 ESD#4.1/P/x86_64/Enterprise Linux/ase1503/2782/64-bit/FBO/Tue Jun 12 02:05:16 2012:

      MemUsage = 126, NPlans = 1, ExecTime(s) = 20, Rows Processed = 261375

      ASE Adaptive Server Enterprise/15.7/EBF 21708 SMP SP110 /P/x86_64/Enterprise Linux/ase157sp11x/3546/64-bit/FBO/Fri Nov  8 05:39:38

      2013:  MemUsage = 126, NPlans = 1, ExecTime(s) = 40, Rows Processed = 261375

      ASE Adaptive Server Enterprise/16.0 SP02 PL01/EBF 25184 SMP/P/x86_64/Enterprise Linux/ase160sp02pl00/2424/64-bit/FBO/Thu Sep 17

      22:46:42 2015:  MemUsage = 166, NPlans = 1, ExecTime(s) = 41, Rows Processed = 261375

      Maybe we need to compare config files, traceflags, etc.

      (0) 
      1. Simon Ogden

        I’d suggest checking whether it might relate to streamlined dynamic SQL, one of the features of that is the grabbing of lava execution contexts at compile time, this effects any SQL, not just statement cache. It might be something in that area going awry maybe.

        (0) 
        1. Kevin Sherlock

          fair enough.  I enabled “streamlined dynamic SQL”, rebooted, and:

          ASE Adaptive Server Enterprise/16.0 SP02 PL01/EBF 25184 SMP/P/x86_64/Enterprise Linux/ase160sp02pl00/2424/64-bit/FBO/Thu Sep 17

          22:46:42 2015:  MemUsage = 182, NPlans = 1, ExecTime(s) = 36, Rows Processed = 261375

          (0) 
    3. Avinash Kothare

      AIX : Got results similar to Kevin and Mike

      ASE 15.5

      ASE 15500:  MemUsage = 110, NPlans = 1, ExecTime(s) = 13, Rows Processed = 145321

      ASE 15500:  MemUsage = 110, NPlans = 1, ExecTime(s) = 12, Rows Processed = 145321

      ASE 16.0

      ASE 16000:  MemUsage = 166, NPlans = 1, ExecTime(s) = 16, Rows Processed = 104218

      ASE 16000:  MemUsage = 166, NPlans = 1, ExecTime(s) = 15, Rows Processed = 104218



      Avinash

      (0) 
  3. Andrew Melkonyan Post author

    Thanks all for testing. 

    So far only my Solaris boxes exhibit this phenomenon (no streamlined or statement cache for me either).  My ASE config is not something special + the 15.0.3 – 16.0.x tests done here were performed on ASEs “identically” configured on server config level (+/- since there are features not available in the earlier version).

    Unfortunately I do not have large ASE installation on a platform other than Solaris to test.

    I am installing the latest Solaris version to verify & then will move to strip cfg to basics (memory size configs) and retest.

    There is a P1 case open on this – let’s see what we get from that corner.

    Thanks again for contributing to the investigation.

    (0) 
  4. Andrew Melkonyan Post author

    The culprit found.

    What causes this behaviour is – turning auditing on.

    Could you possibly test it on your releases?

    …so you were right on something in config driving it – might have spared me time testing 🙂

    …on the positive side – I can’t be advised to upgrade to the latest release for fix – it’s not there….

    -> auditing should be turned on

    -> login used to execute the proc should have turned auditing on for all activity it does

    (0) 
    1. Simon Ogden

      So is there a bug here? Or is this still just an investigation that might lead to a conclusion that all is well?

      If there is a bug related to excessive (unnecessary) proc cache usage then please let us know what audit option causes it, what version(s) it effects and whether it can be avoided.

      (0) 
      1. Andrew Melkonyan Post author

        There is a bug.  It affects all of the ASE 16.0.x.x releases.  The only option so far is to turn off auditing for the login executing the cursor.  I did not investigate further which particular auditing option triggered this – the login is configured to audit all activity.  I could test only the Solaris code line.  Not sure about other platforms.

        (0) 
  5. Andrew Melkonyan Post author

    Hey ho.  we’ve got our response from TSE and guess what:  this is SAID to be expected behaviour.   ASE 15.7/16 will gobble all proc cache however large it is if there is a procedure executing a cursor and someone has been foolish enough to audit this procedure activity. 

    Hm.  I don;t know about you – to me this sounds – if not a bug, something that the engineering has completely missed out. 

    1. 15.0.x – somehow auditing same procedure does not blow up procedure cache (however inane the request to audit procedures that run cursors is).

    2. 15.7/16.x – auditing of simplistic procedure is EXPECTED to eat up gigabytes of procedure cache.  Remedy?  Increase the proc cache or turn off auditing.

    Seriously?  I’d whip someone that claims this behaviour to be “expected.”  Expected by whom?  You have a simple condition to bring down ASE and it is expected??? 

    Pinch me for it seems to me I’m dreaming.

    Anyone can nag TS to reconsider this and FIX IT (lodged under case 191421/2016).

    Thanks.

    (0) 
  6. Andrew Melkonyan Post author

    There appears to be a workaround for this PC thrashing:  if the auditing for a login is set using sp_audit cmdtext, {login}, ‘all’,’on’ rather than sp_audit ‘all’,{login},’all’,’on’ the memory usage by the proc stays fixed on low KBs rather than growing into GBs (thanks to Jeff for suggesting to test it).

    (0) 

Leave a Reply