Skip to Content

Don’t get me wrong – crutches are useful things.  But some of places still are hobbling around on crutches long after they should not be. What crutches am I referring to??

  • Dbcc proc_cache(free_unused)
  • Trace flag 753 (and 757)

Now, before some well meaning but misguided person assaults me for being anti-crutches – I, of course, recognize the need for them…at a point in time.  But when you break a leg, the point is you go to the doctor, get it fixed and then only use the crutches until it heals.  What you don’t do after breaking your leg is to simply dig out a pair of crutches and use them for the rest of your life without ever getting your leg fixed.  That just isn’t natural.

And yet there are places that run with the above in place….for years.  Years of pain unnecessarily as both the above point to proc cache being undersized.

So how do we correctly size it??

First, you have to understand the biggest consumers.  Don’t jump in with “stored procedures (and triggers)” right yet ….for example, most SAP Business Suite systems run with multiple gigabytes of proc cache and there isn’t a stored procedure to be found.  If you look in monProcedureCacheModuleUsage, you will see there are nearly 30 different allocators of procedure cache, including:

  • Parser
  • Utilities
  • Diagnostics
  • Optimizer
  • Execution
  • Access
  • Backup
  • Recovery
  • Replication
  • Procedural Objects
  • Sort
  • HK GC
  • HK Chores
  • BLOB Management
  • Partition Conditions
  • Pdes Local HashTab
  • Statement Cache
  • CIS
  • Frame Management
  • AuxBuf Management
  • Network
  • Procmem Control
  • Data change
  • Dynamic SQL
  • Cluster Threshold Manager

Of these, the top most common consumers typically are:

  • Optimizer
  • Execution
  • Procedural Objects
  • Statement Cache
  • Dynamic SQL

With a few others sometimes contributing, such as

  • Parser
  • Utilities
  • Access
  • Sort
  • Procmem Control

The two culprits that I think contribute the most to customers relying on the above crutches is grossly underestimating the proc cache requirements of the Optimizer+Execution and Statement Cache.  The latter, is often directly due, I think, to the former – so let’s spend some time there first.

Every query – whether a simple “select * from table” to a complex proc – uses procedure cache. Every query.  Get used to it.  It starts when the query is first received and the SQL is parsed.  Thankfully, this tends to be not that much – just a few pages of proc cache.  Then we get to optimization….and OUCH!!  This is a heavy consumer of procedure cache.  The typical query optimization plan can run from a few low 10’s of KB for a simplistic “select * from table” to multiple 10’s of MB.  This would also be true for simple inserts, updates or deletes using literal values.  However, once joins start, this quickly jumps to ~200KB.  More complicated queries such as large joins/nested subqueries can easily consume a few MB of proc cache.

But wait.  Remember how we got that plan.  First the optimizer had to develop a number of “work plans” as possible alternatives.  Each of these as developed are kept in memory until the final costing. The number of work plans depends upon the number of tables involved as well as the number of indexes and predicates in the where clause – but it easily can be 10-20.  Consequently, a single query optimization can use 2-4MB of procedure cache during optimization – and then drop down to 200KB once the final plan is chosen.

But then we are not done yet.  Those familiar with 16sp02’s compiled query feature understand that once a query optimization plan is chosen, it is not the end.  Nope.  The query execution engine needs to develop an execution plan on how it is going to execute the physical operators in the query plan.  For example, if a hash-based grouping is chosen, it needs to create the hash buckets (the number of which depends on the estimation), etc.  Net result is likely that 200KB query plan needs another 200KB of proc cache for the execution plan.  This is why if you pay much attention to monProcedureCacheModuleUsage that Optimizer and Execution are frequently consuming proc cache.

That is just for a basic query with a 2-3 way join and a few predicates.

Now, take that 6000 line stored procedure, which, likely contains a few hundred of these. Yep.  Now we are talking 10’s if not 100MB of proc cache.  But usually, most well developed procs need about 2MB.  And remember, that is just for one copy.  With each concurrent execution, yet another copy of the plan is loaded.  Of course, remember – what we have loaded is just the procedure optimization plan – we still need an execution plan for each statement.

How can we use this for sizing?  The answer is simple – how many concurrent queries is your system experiencing on a regular basis?  A good estimation if you are not sure is to simply figure 10x the number of engines you are running.  So, on a 30 engine system, it is likely there are 300 queries active at any point in time. If we figure that each concurrent query likely needs about 2MB of proc cache as a very loose estimate – then we are saying we need about 600MB of proc cache just in query optimization and execution – which is not unlikely.  We also know that it will take probably 500MB of proc cache to hold 250 normal stored procs. Each statement in statement cache will need another ~100KB for just the statement – and each plan for that statement will need memory as well…..  Ohhhh…..the math…it just hurts the head.

The better method is to simply think about the number of concurrent active sessions.  Now, if you are the type of shop that runs with 10000 connections with most of those idle at any point in time, you might want to start monitoring monProcess periodically and then run a query against the collected data similar to:

select SampleTime, count(*)

  from monProcess_hist

  where WaitEventID!=250

group by SampleTime

order by 2 desc

For grins, let’s say the answer is 1000.  Now we need to know how many are running stored procs on average vs. executing queries. All we need to do is to add a where clause of LineNumber>1 to the above query (yes – a batch can have more than one line….but then batch/stored proc – nearly same thing when it comes to proc cache for optimization).  Again, let’s say the answer is 100 (nice 10%).

Now that we know our concurrency, we have to figure how much proc cache per user.  Remember, in addition to currently executing queries, we need proc cache for statements in statement cache (in addition to the statement cache itself) and we need to keep some of our most frequently executed procs in cache.  The latter – if we assume our 10% is hitting – may already be counted, but let’s use the following estimates:

  • Each query needs 200KB of proc cache for optimizion – plus same for execution
  • Each user will cache ~15 statements (between statement cache or Dynamic SQL – take your pick).  Each statement needs 200KB for the plan.
  • Each proc will need 2MB of proc cache – and we want to keep ~250 or so in cache (including all the concurrent plans)

We then can compute:

  • 900 queries * 200KB per query * 2 (opt + exec) = 360MB for queries
  • 100 procs * 2MB per proc * 2 (opt + exec) = 400MB for proc based on concurrency
  • 250 procs in cache * 2MB per proc = 500 MB for proc cache (bigger – we will use this number)
  • 1000 users * 15 statements/user * 200KB/statement=3GB

…..so, with just some back of the envelope planning, we can see that we would need >4GB of total proc cache.  Let’s separate out statement cache – each 1GB of statement cache can hold 15000 statements (without the plans), so really we are talking 1GB of statement cache and 3.5GB of proc cache (for the procs, the queries and the statement cache plans).

At this point, mistakes are made.  The typical DBA might set proc cache at 4GB and statement cache at 1GB and think the job is done.  And then complains to support about proc cache spinlocks …..and support gives them the crutch of dbcc proc_cache(free_unused) and the problem goes away…not really.  They have to run it every 30 minutes or whatever….so they are treating the symptom but not the problem.  The real issue is that our MINIMUM requirements for procedure cache is 4.5GB. In reality, when ASE boots, it allocates procedure cache and then takes 50% of the FREE pages and allocates them across the engines to the Engine Local Cache (ELC).  This ELC is where we grab proc cache buffers from when doing optimization or execution without having to grab from the common pool – thus avoiding the spinlock.  However, if all the available proc cache is used – or nearly so – then there is nothing left for the ELC.  As a result, when an engine needs more proc cache, it has to go to the global pool. Now, if memory is tight, it is likely more fragmented as we likely have been doing a lot of proc cache turnovers – and attempts to do large (16KB) allocations fail and so we retry with 2KB allocations….and since this works, the next thing we know we are happily running with TF753 (disables large allocations) and periodic dbcc proc_cache(free_unused) and heaven help the poor fool that suggests we simply haven’t sized things correctly.

But you haven’t.  The least little bump….like a more complicated query such as those EOM reports….and proc cache is exhausted.  What happens then is an engine sees it has nothing in ELC, requests memory from the global pool – finds there is none – and then ASE simply tells every engine to stop what it is doing and flush any free ELC proc buffers back to the global pool so that engine 1 can do it’s query.  That hurts. Mostly because as soon as engine 1 is done, engine 2 now is screaming for proc cache and we get to repeat the debacle.  And the proc cache spinlock heads into warp speed.

The answer – you guessed it – take your computed proc cache and double it.  So, if you think you need 4GB of proc cache – configure for 8GB. Then if your proc cache spinlock is still high it is likely due to another reason like frequent turn over (e.g. exec with recompile) or other cause.  But at least you can get off the crutch of dbcc proc_cache(free_unused) and TF753.

It makes me shudder every time I hear some supposed internet guru telling a customer to use them….without addressing the real problem.  By all means – use the crutch – but get the leg fixed and then ditch them.  Right size your proc cache.

To report this post you need to login first.

25 Comments

You must be Logged on to comment or reply to a post.

  1. Simon Ogden

    Hi Jeff,

    I might give a more detailed reply at some point but unfortunately I’m really snowed under at the moment so the best you get is an ‘on the commute’ response 🙂

    Why have you lumped ‘free_unused’ with 753? They are entirely different and I completely agree with you that the dbcc command should NEVER have been recommended by Sybase at the time. It is ridiculous to suggest a command that forces hard-earned plans from the cache simply to mask the underlying issue (which is not always down to sizing).

    However, having said that, it was Sybase engineering that advised on its use, not technical support. This was mainly as their entire methodology was to hand out crutches to anyone who looked like they might be suffering. I’d say that is still the case now under SAP. It may not be their choice but they are not always in the business of fixing the underlying root causes strategically.  You guys even went to the trouble of documenting the command in the official docs. All support did was do what Engineering asked them to do and firefight with tactical workarounds.

    Traceflag 753 however is no bad thing. Without it, even if you have 3 times more proc cache than the calculations say, it can (under specific circumstances) become fragmented mainly as there is nothing built in to the memory manager to reclaim smaller chunks and rebuild larger contiguous memory etc. This isn’t just a theory, this is from specific in-depth testing of workloads, in my case gathering real time risk data.

    If you can cut out overhead of compilation the best you can (I.e force plans where possible and avoid anything other than controlled recompilation) then the predictability and performance of 2KB allocations can be better. Predictability because you are known to be looking for 2KB each time and it’ll likely find it versus asking for any size up to 16KB and maybe finding it quickly, maybe not finding it all – the point being it’s unpredictable by default and we don’t like unpredictable.

    There is also a significant issue with caching of large statements as it appears they need to pull contiguous pages in order to store the statement cache object. If you have a query with a few thousand values in an IN list the statement becomes huge. It never finds that amount of contiguous space and (in our case) runs forever simply repeatedly flushing all the ELCs (with 6GB proc cache free out of 13GB). We have to run TF 7790 to make the server even useable, another crutch supplied by SAP.

    I agree with you that proc caches do seem to be in general undersized, but how can you blame anyone for looking at the diagnostics you guys provide that are supposed to tell how much is free and act on that information? To a DBA they see proc cache never going above say 80%, they are obviously going to think it’s fine. It’s up to SAP to provide the means to help the DBA do their job.

    Cheers,

    Simon

    (1) 
    1. Mike Willett

      Excellently written Simon as usual.

      As I’m sure Simon remembers, he’d helped us fix problems with our procedure cache. We tried doubling it and the doubling it again so its now 16Gb. Didn’t really want to double it again – I want to keep as much data as I can in cache – not use it for procedure cache.

      Simon is correct in saying it was ASE support who suggested running free_unused as well as trace flag 753. Under Simons advise – we ignored the free_unused and just used the 753. This got rid of the problem.

      If the choice is taken more space from data cache and allocate to procedure OR to use traceflag 753 0 then its an easy choice.

      (0) 
    2. Jeff Tallman Post author

      Not sure what the comments above are trying to say.   Let me clear.   It is my opinion that many use cases (arguably most) of dbcc proc_cache(free_unused) as well as TF753 go beyond what they should.  The PROBLEM is – and this is a huge gap that I recognize (but have no control of the solution) – is that in order to find out what is driving the issues, one needs a LOT of data over time to analyze and find what are the possible root causes (plural).   Unfortunately, sometimes it is easier for someone from support to give you a crutch and get you off the phone – close the case – ESPECIALLY as closed cases is a common industry metric – than to spend hours or days analyzing the information necessary to determine the real cause.

      Also, just because sp_monitorconfig doesn’t show 100% (or close to it) doesn’t mean there is some contention.  In many cases – for example, you cite proc cache at 80%- to me, that proc cache is could be constrained in some situations.   For example, given that a typical query (and not the ones with hundreds of IN() clauses) needs 2-4MB of proc cache during optimization, when you have 30 engines online, that translates immediately to having 60-120MB of free proc cache just to satisfy the possibility of 1 query being optimized per engine per instant in time.   If we consider 1000 active users, my suspicion (and I can’t prove this because AFAIK there is no easy way to find out) is that 25% of these are likely doing some form of optimization – whether proc re-optimization due to whatever reason or a new query.  That means 250 users doing optimization.   Over 30 engines, that translates to 8 or 9 per engine.  If we use the lower value, than each engine needs minimally 16-32MB of proc cache in its ELC to avoid constantly hitting the spinlock and the global pool.  30x (for the engines) means 480 to 960MB of proc cache FREE in the server.   If proc cache is at 4GB, 80% could mean it is getting a bit sparse in the ELC areas.

      We also have to recognize there are other factors than just proc cache sizing and spinlock contention – a blog cannot cover all the ramifications.   For example another common reason for high proc cache spinlock contention is middle tier applications using connection pools and fully prepared statements that grab and release connections from the connection pool rapidly (as many OLTP apps might do).   This lead to the the ‘streamlined dynamic SQL’ option as a way to divert fully prepared statements from constantly being created and dropped with every switch in user of the connection in the connection pool.

      There are still users out there with valid reasons for using dbcc proc_cache() or TF753 – unfortunately, I don’t think enough has been done to determine why they are still necessary.   Could it be we need a memory manager to defragment memory???  Not so sure as this implies also MOVING procs around in memory which needs spinlock access as well.   Personally, I think a better solution might be to divide proc cache into multiple caches/cachelets – e.g. a dedicated cache for histogram stats, a hot proc/trigger cache – and move fully prepared statements to PSS area.

      There is also the issue of allocation sizes – Mike Willett raised an interesting point about why 2K at all vs. 16K by default in a different post.   Most plans that I have seen for even the simplist of queries need 20-30KB of cache.  The problem is that the current memory management in ASE was written decades ago when we all remember that 1GB of memory was rare.  Re-writing the memory manager (as well as other areas of ASE) likely should be done…..but the horrors of destablizing the server to do so could be a tad risky…..    Until then, we simply need better visibility into the server (and much of this could be better MDA counters…..but that is a topic for a different day as well).

      (0) 
      1. Simon Ogden

        My comments are not trying to say anything apart from what they say really 🙂

        You shouldn’t be lumping 753 and dbcc proc_cache (‘free_unused’) in the same bucket in my opinion, they are very distinct and separate commands.

        Both you and I know that 80% proc cache is most likely not OK. What I am trying to say is that not everyone out there has a granular understanding of the finer points of procedure cache management, nor do they want to or have the time to. They expect clear, concise advice about how to size these caches, not continually changing advice dependent on whether the vendor is in firefighting mode or ‘really thinking about it’ mode.

        What they don’t want is to be continually told they are wrong for following the advice which has been given at various times. As I stated, this isn’t advice given by ‘internet gurus’ it is advice given by your engineering teams which some folk have adopted without question and others have realised was questionable.

        The proc cache and its nuances in ASE 15 and above have caused a world of hurt for a number of customers. Some acceptance of that might be a good step instead of folk again being told they are doing it wrong!

        I totally understand that the blog post here is ultimately designed to help people out, which is a great thing. But…you can’t ignore the 11 years in between where there were endless problems with proc cache and iteration after iteration of release by engineering to try to resolve them. Also, if you put yourself in the position of a new user of ASE wanting to know how to size proc cache, they will be left dumbfounded..I thought ASE was all about lower TCO 😉 .

        (0) 
        1. Jeff Tallman Post author

          Simon –   Unfortunately, you are correct – a lot of TS folks were also guilty – and STILL are guilty – no excuse for them either – and sorry, but a lot of the engineering advice is just to get the system stabilized under the assumption that someone will do the investigation to find the root cause.  I don’t think you would find an engineer who would raise their hand to ever suggest that such advice was given as a permenant solution – your allusion that it was is a bit off point.

          Given enough decades, one can always go back in time to any version of ASE and find bugs.  It is counter productive to associate 15.0.3 issues with 15.7 and latter releases – especially as the earlier releases are long since EOL.   While there may be customers still on them – there still are people in the US driving 15 year old vehicles – comparing today’s version of the same car to that one is extreme disservice as since then improvements in engine technology, transmission, suspension, etc. make it ridiculous to do so.

          My comments on the depth of our documentation wrt to tuning/sizing is yet another discussion – but that assumes the ‘new user’ read it to begin with…..

          I also described TF753 accurately – if you read it closely, I particularly stated it disable the large allocations.    It is just that the two crutches are often used together.

          (0) 
          1. Simon Ogden

            Yep, we are pretty much agreed but don’t forget dbcc proc_cache (‘free_unused’) has gone through the whole process of officially being *documented* in the migration guide which made it onto the core docs, not in ‘known issues’. This would have been OK had it not been portrayed as a fix rather than a last ditch workaround. Hence folk began to see it as standard command. A lot of people ran with it and didn’t report the issues to Sybase, lessening the chance of root cause (which was known anyway imo).

            I totally agree that an engineer wouldn’t generally like to push a workaround like this but needs must and all that…it happens, sometimes reluctantly.

            With 753 my objection is still that you have said both are used when the proc_cache is undersized, this isn’t always true and 753 has  solid use cases for permanent usage and you could argue it should be default (but I think we’ve covered it now). It is not a crutch for anything.

            I am not sure there is any comparison been made to the older versions, it is understood it has changed significantly, I was just outlining some of the early problems in earlier versions.

            Agreed a new user is unlikely to read the docs on proc cache anyway, but then again a new user can’t find the docs on the SAP website in a reasonable amount of clicks. 🙂

            (0) 
            1. Simon Ogden

              Jeff,

              One thing I did notice that I’m not sure is correct is the space required for the statement cache plans.

              I did some investigation into this a few years ago and determined that the ‘statement cache size’ controls the maximum size of the sql text portion of the statement cache (the ‘Statement Cache’ fragment pool) PLUS the size of the all the PROC_HDRs in the proc cache header pool associated with the plans (PROCBUFs) for the statement cache statements.

              This ‘statement cache size’ value will initially be taken from the existing procedure cache memory pool until more space is needed and then it will dynamically increase. At bounce the total proc cache memory pool will be created in full by summing ‘procedure cache size’ + ‘statement cache size’. As far as I can determine there is no requirement to increase ‘procedure cache size’ when configuring a ‘statement cache size’.

              The number of ‘actual’ procbufs (stored separately, only 300 bytes each, just pointers to the proc_hdrs that hold the plan information) in the MRU/LRU chain is determined from the numeric total of statement cache size and procedure cache size (i.e. if proc cache were 7000 and statement cache 7000 this would produce a procbuf memory pool with 14000 objects).

              I just did a quick sanity check on the latest 15.7 and I think this is still the case. Still yet to try 16 unfortunately.

              Cheers,

              Simon

              (0) 
              1. Jeff Tallman Post author

                While that may sound great in theory (statement cache sizing is enough), in experience with nearly every BuS system I have worked on, we have found that you have to bump up proc cache with statement cache.   My suspicion (no time to confirm) is this is due to concurrent execs with multiple plans in cache.   Custom sites with individual logins might not see this as much perhaps – but sites with common app logins still might.

                (0) 
                1. Simon Ogden

                  Rarely do I give an opinion based on ‘great in theory’ 🙂

                  Does Business Suite still leave all prepared statements open for the duration of the connection? If so, then even without the statement cache enabled, you still have a huge proc cache footprint as every unique statement for every connection will have a plan in the proc cache until disconnect.

                  Ok, so statement cache and streamlined dynamic SQL help here and allow them to be shared based on hash key but unless there are enough slots in the statement cache to store every unique statement you’ll end up with essentially legacy lwps sat in the proc cache that don’t tie up to a statement in the statement cache every time it has to grab an existing statement from lru in the statement cache (as it cannot purge the corresponding lwp).

                  Good point on the multiple plans per statement cache entry though, BS is obviously going to see a higher proportion of these and I don’t know the impact without testing on whether it will still restrict the total number of statement cache plans to the total number of unique statements. My point is a significant portion (the majority in many cases) of ‘statement cache size’ is created as regular proc cache for use by statement cache plans.

                  A bugbear with the lwps is they also contain the sql text of the statement so this information is duplicated which seems a waste of memory.

                  (0) 
                  1. Simon Ogden

                    Late last night (OK, early this morning) whilst we had some prod issues I ran few tests.

                    I think there is potentially a significant bug underlying this.

                    Fully prepared statements (under streamlined dynamic SQL) are NOT tracked as belonging to the statement cache.

                    The implication of this is that when the ASE decides whether it can fit a new statement in the statement cache it works out whether the amount of proc cache (used by lwp plans) plus the statement cache has exceeded ‘statement cache size’, if it has it is supposed to flush out a statement. If you run for long enough with prepared statements it ends up filling the entire ‘statement cache size’ with JUST the statement cache and essentially whatever amount of proc cache it requires to store all those plans. In the Business Suite world as there would be multiple plans for each statement it would never purge anyway. But, the total amount of proc cache available for the statement cache is still *supposed* to be controlled by ‘statement cache size’. I guess in BS, you want the amount of proc cache the statement cache can use to essentially be limitless as it probably can’t purge the statement.

                    However in non-BS, this behaviour can mean that if you run with streamlined dynamic SQL and fully prepared statements you might be using far, far more proc cache for statement cache than you intended (which is why you then have to increase proc cache still further).

                    The ‘statement cache size’ cfg is by design the total amount of space that can be used by statement cache and all memory associated with the plans for those statements (including multiple lwps for the same statement). The figure will added to the existing proc cache. Testing shows that with regular traditional statement cache (i.e *ss), statements get flushed out when the total memory footprint of sql statements + plans exceeds this cfg param.

                    (0) 
                    1. Jeff Tallman Post author

                      I hate SCN – no matter which browser – IE or Chrome – it dies constantly and the formatting by jamming stuff all the way to the edge makes it impossible to read long conversations.

                      However, was busy last few days, but in the past I have had way too many conversations with engineering (and Dave P on occaision) about statement cache due to memdumps (proc cache only thankfully) and MDA data (plus dbcc output).   The gist (and I know you Simon may be aware of some/most of this, but repeating for others sake):

                      • Total procedure cache = statement cache size + proc cache size (based on cfg file)
                      • The amount of memory for statement cache size is reserved for the statement cache – e.g. procs can’t infringe on it (nor lecache objects).  However, according to engineering, statements don’t have the same restriction.   However, real world observations is that the statement object (less plan) does seem to be limited to the statement cache size while the plans seem not to be.
                      • A fully prepared statement (aka LWP) is associated with a specific spid/connection.  If that connection dies, it is dropped.  However, it also means that if you have 1000 connections from the same app all doing the same work, you will have 1000x the proc cache usage (assuming each connection has the same number of fully prepared statements in use).
                      • In most app servers, when a connection is released back to the connection pool, all fully prepared statements are dropped along with other connection context info reset.  This can result in significant proc cache spinlock contention if applications frequently grab a connection from the connection pool – exec a txn – and release the connection back to the pool.
                      • A statement in the statement cache is associated with an suid and is not tied to a connection.  One spid can connect as login ‘fred’ create a cached statement by executing a query and then disconnect.   If ‘fred’ logs back in again, and executes the same query, the pre-existing statement in cache (assuming it is still there) can be reused – even if different spid.   Consequently, apps using a common app login (such as Business Suite) can ‘share’ cached statements, but not fully prepared statements normally. (I wrote the original TechNote on this one based on work Dave P and I did for a customer).
                      • In ASE 15.7, ‘streamlined dynamic SQL’ was implemented to reduce the LWP memory explosion and spinlock contention of app server connection pooling by exploiting the statement cache.  What happens when an app requests a fully prepared statement is that a scan based on the hashkey is done through the statement cache.  If found, the cached statement is invoked instead.  If not, I *believe* (based on observations) that the LWP is created first and then the cached statement (see below as to why)

                      In many situations, I have observed with Business Suite that it is possible to have a statement in the statement cache without a plan attached (no object in monCachedProcedures).  I suspect if the original connection disconnects, it drops the LWP as a normal prepared statement would be (I have not had time to test this however).  As normal, the statement remains behind.   When executed again (and/or concurrently) a new plan is created (I hesitate saying LWP as I doubt it has a proc handle to be invoked as an RPC as a true LWP has a proc handle and can be invoked via RPC….so I will just say it is a new plan in proc cache).  I *suspect* that the original LWP and statement object are created in statement cache, but I *believe* (based on numerous sets of statistics from BuS systems and non-BuS systems) that subsequent or concurrent plans on cached statements are simply created in proc cache. Hence my proc cache sizing puts them outside statement cache for simplist rules (vs. 90 “it depends” that make sizing impossible to quantify)

                      (0) 
                      1. Simon Ogden

                        Agreed this formatting is ridiculous. Is it on a list somewhere to be fixed?

                        Point 2 on your list is not the case from testing, the only plans which are in essence unrestricted are fully prepared statement plans and this behaviour is unintended.

                        Can you have a read through my last post and take that back to engineering? The statement cache allocation is tracked by summing the total used by the statement cache fragment pool plus the proc cached used by their plans (a counter holds this info). When it gets up to ‘statement cache size’ it pushes out statements. Fully prepared statement plans incorrectly do not get included in that counter thus you actually get far too many statements in the fragment memory pool and even more proc cache to hold all their plans.

                        (0) 
                        1. Avinash Kothare

                          Re : formatting of the thread

                          View the whole thread as a PDF file using the link View as PDF right under “Actions” on right hand top corner where this thread begins 😎

                          For today’s responses you can then search by “Apr 29”  !

                          HTH

                          Avinash

                          (0) 
      2. Mike Willett

        Simon,

        When you talk about the procedure cache and its nuances in ASE 15 – do you mean all versions of15 (ie 15.0, 15.5, 15.7 SPxx and SP15.7 SPxxx).

        We ran very well on 15.5 with no procedure cache issues. We then migrated to 15.7 SP1xx (not out of choice) and had procedure cache issues. We’d made no changes to the application  so the issue for us was only visible in SP15.7 SP1xx – not under 15.5.

        What I fail to understand is why have different page allocation from procedure cache. After a few days of grabbing and freeing 2k, 4k, 16k pages then you’re invariably going to find it difficult to grab a 16k page eventually.

        Who said ASE has a lower TCO ? Lower than what other DB ?

        (0) 
        1. Simon Ogden

          In early versions of 15 there were far more proc cache issues than on the current versions, trust me! Different types of proc cache issues, 701 errors, stack traces, all kinds of ‘fun’ stuff.

          The multi-page allocations were not originally in version 15, they were added on the back of what were very significant issues in query optimization. The sheer overhead of compilation on version 15 (particularly in the early days) was so immense that it led to them adding in the large scale allocations to speed up the process of getting that memory as oppose to grabbing a page at a time. After this, all the kerfuffle around dbcc proc_cache(‘free_unused’) started as the resultant (relative) fragmentation was pretty much unavoidable.

          Maybe they should’ve targeted the memory requirements of the optimizer but all is said and done now, Jeff may be able to elaborate.

          Not sure why you were OK on 15.5 but not on 15.7, your experience may not be typical but every shop is different.

          Lower TCO has historically been the USP for ASE. Google is your friend on this one 🙂

          (0) 
          1. Mike Willett

            > Not sure why you were OK on 15.5 but not on 15.


            I suspect that’s because we moved to 15.5 very late  – when I suspect it had stabilised.

            We missed out all the 15.0 versions – we were on 12.5.4 for a good many years. That was one of the products that really did have a low TCO. Don’t think we called support once.

            (0) 
  2. Artem Maystrenko

    Very interesting discussion.

    It is possible to understand as for what it is used in a procedural cache

    really, the constant sentence bothered from support SAP on any problem with a procedural cache(about)

    Dbcc proc_cache(free_unused)

    Trace flag 753 (and 757) 

    constant response.

    I had a problem with a procedural cache (colleagues recommended me a lot of things for check).

    If talk how to check, from where a problem came, then I decided to do so: (can to someone else it is useful)

    1.

    I collect data from request

    select *

    from master. monCachedProcedures

    where objectName not like ‘rs _ %’

      and requestcnt> 1000

    I take 2 samplings and I receive a difference on procedures(I select the greatest execution in time)

    I understand what procedures which are most used!

    2.

    I take this procedure and I watch its plan.

    I understand that in it it is the most expensive.

    Then I optimize it

    3.

    I watch the biggest procedures (in the place in a cache) on the volume of the occupied memory

    I try to optimize them also

    4.

    I look

    select sum(active) as Act,sum(numpagesreused) as Reu into #a2

    from master. monProcedureCacheModuleUsage 

    for viewing reuse

    Total:

    in 2 weeks it was succeeded to stabilize operations of the DBMS more or less.

    Dbcc proc_cache(free_unused)

    Trace flag 753 (and 757) 

    I don’t use

    The only thing that I don’t understand that it * sq0571184887_1310411587ss * on tempdb? And how to fight against their big execution(statement cache switch off)

    (0) 
    1. Jeff Tallman Post author

      *sq057…._131….ss* refers to a cached statement.  You will see similar but slightly different naming scheme for fully prepared statements.   For statement cache, the first part (shooting from memory) is the object id in tempdb and the second number (131..) is the SSQLID – I may have them reversed.   For statements in statement cache, you can use show_cached_plan_in_XML(SSQLID, [<PlanID> | 0]) to retrieve the qp in XML format – I find it extremely useful as it has both the original variable values compiled with as well as the the most expensive execution parameter values……with a bit of twisting with xmlextract(), etc. you can regenerate test queries.   To see the statement stats, simply select from monCachedStatement where SSQLID=<SSQLID>…..

      (0) 
      1. Simon Ogden

        *sq is either a fully prepared statement or an ad-hoc language cursor running with ‘streamlined dynamic SQL’=1, see the below link for details.

        First bit is object_id/SSQLID/lwpid (one and the same id for a statement), the second bit is the hashkey (unique for a sql statement and persistent across bounces). If you have ‘enable stmt cache monitoring’ on you can search by that hashkey in monCachedStatement and find the sql text with show_cached_text().

        https://scn.sap.com/community/ase-custom-applications/blog/2013/05/22/what-is-in-my-statement-cache

        (0) 
  3. Andrew Melkonyan

    Hm.  To pick up upon the crutches metaphor – Proc Cache is the Achilles Heel of ASE.  Crutches are there to feel less pain (fear?) when you step on it.  And since Achilles Heel never really heals…

    A couple of observations (from an internet guru who still remembers the excitement of tending spinning ASEs – in my case CEO was delivering ice poles to cool us down – not all CEOs are so well composed in time of chaos):

    1. Your sizing is based on multiplying arbitrary plan sizes (200 KB – 2 MB) by the number of objects of corresponding types + average concurrency.  Why using this approach rather than relying on HWM from the PC MDAs (which reminds me that as of ASE 16.0.1.2 this is still zeroed for both with case attached to it – perhaps that’s the answer)?

    2.  PC/SC sizing must take into account that plans/LWPs are not reused across different logins (unless another crutch is used – TF7738).  There are customers with 1000s of distinct logins. Sizing PC/SC for them is different (any note on it in documentation would be nice).

    3.  Doubling the PC size ad infinitum is nice – but I remember (from memory) an installation guide advising not to size PC to high values on Solaris platforms as in some case ASE may hang on start up (not sure if this is still relevant – hopefully not?).

    4.  TF753/757/758 &c:  753 has been empirically “proven” to be more effective (given a single PC cache chain – any work done on dividing it?).  The problem with large chunks/auto-tuning has been that with fragmented memory the time it took to locate a new free chunk was longer since the locked up chain has to be scanned multiple times for each differently sized chunk, thus keeping spinlock longer.  Some TFs have been translated into config parameters (which sounds like turning a crutch into a permanent protease?) – with exception of 753 which for some reason is still kept as TF.

    4.a.  As an elaboration on  this, running free_unused on small PC chain will probably take less time than running it on large 10x GB sized PC.  What is the risk using the old-fashioned crutches on the new-fashioned PC sizing?   Will not the chances to have ASE frozen by this dbcc command increase with size of PC?

    5.  The “real” problem with PC sizing is – because it is sooooo sensitive to the type of workload (degree of concurrency, login diversity, session lifetime) it is absolutely impossible to simulate it in the testing environment.  DBA’s may feel cosy and warm running their intricate tests and reporting to the managers their complete satisfaction with the results (ASE rock-solid stability, amazing throughput) only to find themselves in complete mess with going live.  The only tools (crutches?) they will have in that case is the old good dbcc commands which will now become more risky to use under new ASE config.  If you ask me, I’d distribute Workload Analysis feature to ASE customers for gratis and work on making it available to earlier releases of ASE for those still using it – as a way to mitigate the futile efforts to test new ASE releases prior going live.  Sometimes thinking customer pays back more than licensing a useful feature.  Competitive edge begins here.

    6.  The reason customers still use crutches in production environment is in TS inefficiency.  TS simply has no time to invest in singling out the reason why some customers are still stuck in their mess.  It is a problem with TS approach.  TS approach currently is – return the environment to operational state by whatever means without going into lengthy process of investigating what caused ASE to become in-operational.  TS has very seldom been able to reproduce the problem internally (not enough internal resources?).  Customer has very seldom been able to continue supporting TS in their investigations (prolonged instability of the production environment is not an option).   Again – having workload analysis available might have changed this.  From my experience, the situation is:  DBAs find a working workaround (with/without TS help) and move on having no time/tools to invest in solving the problem by singling out its root cause.  I think the real frustration at seeing the customers using crutches where they should not should be aimed at vendor’s inability to genuinely help customers out of their situation.  DBA’s lack of knowledge in ASE sizing only masks this frustration.

    Cheers,

    Andrew

    (0) 
    1. Jeff Tallman Post author

      In response:

      1 – sizing is often an exercise when the system is not available or being built.  I can ask someone how many concurrent queries they are planning or simply use 10% of the total connections they have planned (or 10x the engines) as both seem on average to be fairly reasonable rules of thumb.  Likewise – except in places where procs are heavily used (and in place of prepared statements/statement cache as procs involve fewer spinlocks), in general or on average, 10% of the concurrent queries use procs.   I don’t know why HWM is zero’d – suspect a bug.  You can use Active instead so long as you recognize that it actually *could* be higher just before any sample…..but likely not appreciably larger if sampling on a decent interval.  You can also use monProcedureCache/monStatementCache (or the equivalent in sp_sysmon) and so long as the number of removals/inserts for either does not approach 1%/minute sampled (e.g. 10% for 10 minute sample) vs. total procs/stmts in cache – then the proc/stmt cache is likely sized okay.  Maybe not optimally, but not impacting performance negatively on a large scale.

      2 – I am not yet in favor of sharing plans – used to be a cfg and I was one of the early testers of it.  Originally, it was supposed to be fixed in sp100 – not sure – but I also am not sure of the security ramifications – especially if using predicated privileges/FGAC……

      3 – doubling as a rule works in smaller values – however, unless it appeared severe, I would not suggest going from 8GB to 16GB.   I think it is wiser to move in incremental steps – e.g. increase by 1-2GB and re-observe.   Doubling is faster….but……  Remember, most memory caches have some form of hash table.  Unless increasing the cache increases the number of hash buckets, the hash scan becomes much longer (there were significant issues with this in early statement cache where a static number of hash buckets were used and the hash scan down the chain was taking longer than the optimization would have).   Remember, most cache reads (e.g. bufsearch()) grab the spinlock, so long hash chains result in higher spinlock contention (hence the comments about lock hashtable size wrt lock hash chain length output in sp_sysmon) – so GROSSLY oversizing proc cache can lead to increased spinlock contention as well.   Consequently on really large proc caches, I would be half tempted to reduce the ELC size back down from the 16.x 50% default to the 12.5/15.x defaults of 25% (actually hard coded – not configurable).   But this would take some monitoring to tweak and shouldn’t be done lightly.

      4 – I think a lot is situationally dependent.   The only way to truly verify cache fragmentation is via memdump.  But without a cache cleaner, the reality is that all caches will be fragmented over time.   TF753 suggests that most (if not all) the free space freed by normal  proc cache deallocations are 2KB and that larger chunks are rare.   Implication is that 50% or more of proc cache is in use (just think about it – if every other 2KB chunk was allocated – 50% is allocated and the rest is fragmented).  Someone put on here that they are running TF753 and ELC of 80%….and that to me is extremely counterintuitive and very much suggests a undersized proc cache.  But without any detailed data based on a reasonble sample interval, I can’t prove that (or disprove myself).   I can only state that sometimes what is provided by support is merely a workaround to stablize the system and the full RCA needs more exploration – doesn’t mean the stabilization workaround is intended for long term usage.

      (0) 
      1. Mike Willett

        > Someone put on here that they are running TF753 and ELC of 80%….and that to me is extremely

        > counterintuitive and very much suggests a undersized proc cache.


        We run ELC at 80% and TF753.


        Since moving 15.7 our procedure cache has quadrupled to 16Gb.

        Without the TF753 and ELC at 80% we found the server practically freezing for 20 – 30mins.


        Given the choice between increasing the procedure cache to 20GB or 30Gb (and thereby decreasing the data caches by this much)  or runing with TF753, I’ll stick with TF753.


        Grabbing different sized memory allocations from a single bucket and expecting this not to become fragmented isn’t sensible on a high volume system.


        (1) 
  4. Ho Raymond

    I am a happy user of clutchers in ASE 15.7 SP13x Linux

    > Active traceflags: 753, 758, 3604, 15891

    I have written 2 extensions to the shareware “sp__tsnap” to do health checks on Proc / Statement cache utilisation (see sample outputs below).

    • I have configured ELC to be 80%
    • 100% of Statement Cache usage does not bother me at all
    • as long as HWM Proc Cache remains below 90%;
    • the Proc Cache spinlock cost for the ASE engines to grab up to half of the 20% Global Pool is affordable as it does not bring the ASE dataserver to its knees (thanks to the TF 753/758 glutches)
      WARNING – This cost would be outrageous if the Global Pool is on its last buffers as the Proc Cache spinlock would push CPU usage to 100%?!
    • Based on the above guidelines, I have hardly everhad to need to use the other clutch dbcc proc_cache(free_unused).

    The caveat is

    • monitor Proc Cache usage to keep an eye on its HWM;
    • aim to always reserve10% or more headroom in the Proc Cache’s Global pool
    • do add more RAM to allocate more shared memory to size Proc / Statement Caches correctly (clutches may be cheaper than h/w but constantly limping with the cltuches as Jeff pointed out is not a nice way to manage ASE);
    • the samples below are taken from a small ASE with 59 G shared memory by today’s standards – my other ASEs have 128 G & 160 G shared memory but the same principles apply in keeping Proc / Statement Caches sweet;
    • convince the DEV teams to use generic application logins to reduce Statement Cache spinlock contention;
    • when both free buffers in Statement Cache & Proc Cache are almost exhausted then Jeff’s scenario (ASE engines taking turns to get all engines to flush ELC buffers to Global Pool while a single ASE engine attempting to scavenge sufficient buffers from Global Pool) will kill the ASE server performance as CPU hits the roof.

    If anyone wants copies of the sprocs to generate the sp__tsnap pc & sc reports, give me a shout.

    1> sp__tsnap pc
    2> go

    *****************************************************************************************************************************
    + rpc10_1 ASE Snapshot Report – Procedure Cache Summary Mode +
    *****************************************************************************************************************************
    Procedure Cache (PC) Configured Size : 2,000,000 (2k) Pages
    Engine Local Cache : 1,768,000 (2K) Pages (80% at ASE startup)
    Currently Used : 1,274,718 (2k) Pages (63.00%), HWM 1,720,384 (2k) Pages (86.02%)
    Global Cache : 4,678 cached procedure QPlans, 1,061,934 (2k) Pages (53.10%)
    : 5,405 cached statement/LWP QPlans, 171,762 (2k) Pages (8.59%)
    # LWPs + Stored Procs Requested : 556,272,623
    # Stored Procs Loaded : 406,524
    # Stored Procs Written : 893
    Statement Cache (SC) Configured Size : 210,000 (2k) Pages (it may grow to this limit within PC)
    Statement Cache : 4,977 cached statements, 61,868K SQL texts, 209,995 (2k) Pages (100.00% of configured)
    Report Time : May 5 2017 6:39PM
    Time Since Last ASE Boot : 2 months 1 day

    1> sp__tsnap sc
    2> go

    *****************************************************************************************************************************
    * rpc10_1 ASE Snapshot Report – Statement Cache Summary Mode *
    *****************************************************************************************************************************
    Statement Cache (SC) Configured Size : 210,000 (2k) Pages
    Currently Used : 4,977 cached statements, 100.00% or 209,995 (2k) Pages
    Literal AutoParam : Enabled
    SC Hit Rate : 99.33%
    # Removed Due to LRU algorithm : 455,990
    # Removed Due to Schema Change : 1,961,150
    # Removed Due to lack of Space : 402,961
    Report Time : May 5 2017 6:39PM
    Time Since Last ASE Boot : 2 months 1 day

    1> sp__tsnap mem
    2> go

    *********************************************************************************************************************************
    * rpc10_1 ASE Snapshot Report – ASE Memory Usage Summary *
    *********************************************************************************************************************************
    Max Shared Memory : 56.90 Gb, 29,831,987 (2k) Pages
    Total logical Memory : 52.71 Gb
    Total Configured Memory : 39.66 Gb
    Allocate All Memory at Start-up : True
    Top 20 ASE Config Memory Usages Memory Pages Logical Share % Overhead
    ——————————- ———- —————- ————— ———–
    heap memory per user*: 4.4 Mb n/a n/a
    Procedure Cache (3.80 Gb): 4.31 Gb 2,257,901(2K) 8.17 % 503.71 Mb
    Data Cache (29.00 Gb): 32.81 Gb 17,203,239(2K) 62.25 % 3.81 Gb
    CIPC regular message pool size: 1.63 Gb n/a 3.09 %
    Lock Structures: 10.41 Gb n/a 19.74 %
    Lock Hashtable: 864.04 Mb n/a 1.60 %
    Kernel resource memory: 668.28 Mb n/a 1.24 %
    Executable Code Size: 539.90 Mb n/a 1.00 %
    Statement Cache: 463.05 Mb n/a 0.86 %
    User Connection Structures: 411.21 Mb n/a 0.76 %
    Disk I/O Structures: 280.00 Mb n/a 0.52 %
    Open Object Structures: 269.49 Mb n/a 0.50 %
    Open Database Structures: 68.99 Mb n/a 0.13 %
    workload manager cache size: 39.33 Mb n/a 0.07 %
    Open Partitions Structures: 31.75 Mb n/a 0.06 %
    Online Engine Structures: 29.79 Mb n/a 0.06 %
    Open Index Structures: 25.95 Mb n/a 0.05 %
    Transaction Descriptors: 22.28 Mb n/a 0.04 %
    Deadlock Pipe Buffer: 21.90 Mb n/a 0.04 %
    SQL Text Pipe Buffer: 11.21 Mb n/a 0.02 %
    ——————————- ———- —————- —————-
    Total Top 20s: 52.82 Gb n/a 100.19 %

     

     

    (0) 
  5. Ho Raymond

    Here is the at-a-glance call-card of the small ASE server (others are 15.7 SP136, SP137, SP138)

    1> sp__tsnap
    2> go
    ****************************************************************************************************************************
    * rpc10_1 Sybase ASE Snapshot Report – Version 1.5.4 *
    ****************************************************************************************************************************

    Attribute Value
    ————– ————————————————————————————————————————
    Report @ May 5 2017 7:19PM
    Host Name <masked hostname>
    IP Address <masked IP> (current session)
    Listeners <masked hostname> <masked port#> |
    Endianness Small
    Server Name rpc10_1
    Boot Time Mar 5 2017 4:41AM
    Run Time 2 months 1 day
    IO Busy % 7.368
    CPU Busy % 3.414
    Idle % 89.216
    Version Adaptive Server Enterprise/15.7/EBF 24758 Cluster Edition
    SP135 /P/x86_64/Enterprise Linux/ase157sp133x/3927/64-bit/FBO/Thu Jun
    Page Size 2,048
    Engines 20
    Opt Goal allrows_mix
    License ASE_CORE / Cluster Edition
    Type CPU license
    Language us_english
    CIS Version Adaptive Server Enterprise/15.7/EBF 24758 Cluster Edition
    SP135 /P/x86_64/Enterprise Linux/ase157sp133x/3927/64-bit/FBO/Thu Jun

    (0) 

Leave a Reply