Don’t get me wrong – crutches are useful things. But some places are still hobbling around on crutches long after they should have tossed them aside. What crutches am I referring to?
- dbcc proc_cache(free_unused)
- Trace flag 753 (and 757)
Now, before some well meaning but misguided person assaults me for being anti-crutches – I, of course, recognize the need for them…at a point in time. But when you break a leg, the point is you go to the doctor, get it fixed and then only use the crutches until it heals. What you don’t do after breaking your leg is to simply dig out a pair of crutches and use them for the rest of your life without ever getting your leg fixed. That just isn’t natural.
And yet there are places that run with the above in place….for years. Years of unnecessary pain, because both of the above point to the proc cache being undersized.
So how do we correctly size it?
First, you have to understand the biggest consumers. Don’t jump in with “stored procedures (and triggers)” just yet….for example, most SAP Business Suite systems run with multiple gigabytes of proc cache and there isn’t a stored procedure to be found. If you look in monProcedureCacheModuleUsage, you will see there are nearly 30 different allocators of procedure cache, including:
- Procedural Objects
- HK GC
- HK Chores
- BLOB Management
- Partition Conditions
- Pdes Local HashTab
- Statement Cache
- Frame Management
- AuxBuf Management
- Procmem Control
- Data change
- Dynamic SQL
- Cluster Threshold Manager
Of these, the most common consumers typically are:
- Procedural Objects
- Statement Cache
- Dynamic SQL
With a few others sometimes contributing, such as
- Procmem Control
The two culprits that I think contribute the most to customers relying on the above crutches are grossly underestimating the proc cache requirements of optimization+execution and of the statement cache. The latter is often directly due, I think, to the former – so let’s spend some time there first.
Every query – from a simple “select * from table” to a complex proc – uses procedure cache. Every query. Get used to it. It starts when the query is first received and the SQL is parsed. Thankfully, this tends to not be much – just a few pages of proc cache. Then we get to optimization….and OUCH!! This is a heavy consumer of procedure cache. The typical query optimization plan can run from the low 10’s of KB for a simplistic “select * from table” to multiple 10’s of MB. The low end would also hold for simple inserts, updates or deletes using literal values. However, once joins enter the picture, this quickly jumps to ~200KB. More complicated queries such as large joins/nested subqueries can easily consume a few MB of proc cache.
But wait. Remember how we got that plan. First, the optimizer had to develop a number of “work plans” as possible alternatives. Each of these, as it is developed, is kept in memory until the final costing. The number of work plans depends upon the number of tables involved as well as the number of indexes and predicates in the where clause – but it can easily be 10-20. Consequently, a single query optimization can use 2-4MB of procedure cache during optimization – and then drop down to ~200KB once the final plan is chosen.
But we are not done yet. Those familiar with 16sp02’s compiled query feature understand that choosing a query optimization plan is not the end. Nope. The query execution engine needs to develop an execution plan for how it is going to execute the physical operators in the query plan. For example, if a hash-based grouping is chosen, it needs to create the hash buckets (the number of which depends on the estimation), etc. The net result is that our 200KB query plan likely needs another 200KB of proc cache for the execution plan. This is why, if you pay much attention to monProcedureCacheModuleUsage, you will see Optimizer and Execution frequently consuming proc cache.
That is just for a basic query with a 2-3 way join and a few predicates.
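To put rough numbers on the life cycle above, here is a back-of-the-envelope sketch in Python. The constants (a few 2KB pages for parsing, ~200KB per plan, 10-20 work plans held during optimization) come straight from the preceding paragraphs; the function name and exact figures are illustrative, not measured values.

```python
# Rough per-query proc cache footprint, using the estimates from the text.
# All figures are illustrative approximations, not measured values.

KB = 1024
MB = 1024 * KB

def per_query_footprint(work_plans=15, plan_size=200 * KB):
    """Estimate proc cache used by one query at its peak and at steady state."""
    parse = 4 * 2 * KB                 # a few 2KB pages consumed by parsing
    opt_peak = work_plans * plan_size  # alternative plans held until final costing
    final_plan = plan_size             # the chosen optimization plan
    exec_plan = plan_size              # the execution plan built by the engine
    return {
        "peak_during_optimization": parse + opt_peak,
        "steady_state": final_plan + exec_plan,
    }

f = per_query_footprint()
print(f["peak_during_optimization"] / MB)  # ~2.9MB peak, in the 2-4MB range above
print(f["steady_state"] / KB)              # 400KB once the final plan is chosen
```

Note how the peak is driven almost entirely by the number of work plans – which is why a big join with many indexes and predicates hurts far more during optimization than the size of the final plan suggests.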
Now, take that 6000-line stored procedure, which likely contains a few hundred of these. Yep. Now we are talking 10’s of MB, if not 100MB, of proc cache. But usually, most well-developed procs need about 2MB. And remember, that is just for one copy. With each concurrent execution, yet another copy of the plan is loaded. Of course, remember – what we have loaded is just the procedure optimization plan – we still need an execution plan for each statement.
How can we use this for sizing? The answer is simple – how many concurrent queries is your system experiencing on a regular basis? A good estimation if you are not sure is to simply figure 10x the number of engines you are running. So, on a 30 engine system, it is likely there are 300 queries active at any point in time. If we figure that each concurrent query likely needs about 2MB of proc cache as a very loose estimate – then we are saying we need about 600MB of proc cache just in query optimization and execution – which is not unlikely. We also know that it will take probably 500MB of proc cache to hold 250 normal stored procs. Each statement in statement cache will need another ~100KB for just the statement – and each plan for that statement will need memory as well….. Ohhhh…..the math…it just hurts the head.
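A quick sanity check of that arithmetic – a sketch only, using the loose 10-queries-per-engine and 2MB-per-query estimates from the paragraph above:

```python
MB = 1024 * 1024

engines = 30
concurrent_queries = 10 * engines      # rule of thumb: ~10 active queries per engine
per_query = 2 * MB                     # very loose per-query estimate from the text
opt_and_exec = concurrent_queries * per_query
print(opt_and_exec // MB)              # 600 (MB) just for optimization + execution

# 250 "normal" stored procs resident at ~2MB each:
resident_procs = 250 * 2 * MB
print(resident_procs // MB)            # 500 (MB)
```
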
The better method is to simply think about the number of concurrent active sessions. Now, if you are the type of shop that runs with 10000 connections with most of those idle at any point in time, you might want to start monitoring monProcess periodically and then run a query against the collected data similar to:
select SampleTime, count(*)
  from monProcess_history  -- illustrative name for your table of collected monProcess samples
 group by SampleTime
 order by 2 desc
For grins, let’s say the answer is 1000. Now we need to know how many are running stored procs on average vs. executing ad hoc queries. All we need to do is add a where clause of LineNumber > 1 to the above query (yes – a batch can have more than one line….but a batch and a stored proc are nearly the same thing when it comes to proc cache for optimization). Again, let’s say the answer is 100 (a nice 10%).
Now that we know our concurrency, we have to figure how much proc cache per user. Remember, in addition to currently executing queries, we need proc cache for statements in statement cache (in addition to the statement cache itself) and we need to keep some of our most frequently executed procs in cache. The latter – if we assume our 10% is hitting – may already be counted, but let’s use the following estimates:
- Each query needs 200KB of proc cache for optimization – plus the same for execution
- Each user will cache ~15 statements (between statement cache or Dynamic SQL – take your pick). Each statement needs 200KB for the plan.
- Each proc will need 2MB of proc cache – and we want to keep ~250 or so in cache (including all the concurrent plans)
We then can compute:
- 900 queries * 200KB per query * 2 (opt + exec) = 360MB for queries
- 100 procs * 2MB per proc * 2 (opt + exec) = 400MB for proc based on concurrency
- 250 procs in cache * 2MB per proc = 500MB for resident procs (bigger – we will use this number)
- 1000 users * 15 statements/user * 200KB/statement = 3GB
…..so, with just some back-of-the-envelope planning, we can see that we would need >4GB of total proc cache. Let’s separate out statement cache – each 1GB of statement cache can hold 15000 statements (without the plans), so really we are talking 1GB of statement cache and 3.5GB of proc cache (for the procs, the queries and the statement cache plans).
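The whole computation can be scripted, which makes it easy to re-run with your own concurrency numbers. This is a sketch using the estimates above; KB/MB/GB here are decimal units to match the rounded figures in the list:

```python
# Back-of-the-envelope proc cache sizing, per the estimates in the text.
KB = 1          # work in KB; decimal MB/GB to match the rounded figures above
MB = 1000 * KB
GB = 1000 * MB

queries    = 900 * 200 * KB * 2      # 900 ad hoc queries, 200KB for opt + same for exec
procs_conc = 100 * 2 * MB * 2        # 100 concurrent proc executions, opt + exec
procs_res  = 250 * 2 * MB            # ~250 procs kept resident in cache
procs      = max(procs_conc, procs_res)  # the bigger number wins
statements = 1000 * 15 * 200 * KB    # 1000 users * 15 cached statements * 200KB/plan

total = queries + procs + statements
print(queries // MB)          # 360 (MB) for queries
print(procs // MB)            # 500 (MB) for procs
print(statements // GB)       # 3 (GB) for statement plans
print(round(total / GB, 2))   # ~3.86GB of proc cache, before the statement cache itself

# The doubling rule from the conclusion: configure 2x the computed requirement.
print(2 * round(total / GB))  # 8 (GB) configured
```

Swap in your own measured concurrency from monProcess and the script gives you a starting point rather than a guess.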
At this point, mistakes are made. The typical DBA might set proc cache at 4GB and statement cache at 1GB and think the job is done….and then complain to support about proc cache spinlocks….and support gives them the crutch of dbcc proc_cache(free_unused) and the problem goes away…not really. They have to run it every 30 minutes or whatever….so they are treating the symptom, not the problem. The real issue is that our MINIMUM requirement for procedure cache is 4.5GB. In reality, when ASE boots, it allocates procedure cache and then takes 50% of the FREE pages and distributes them across the engines as the Engine Local Cache (ELC). The ELC is where we grab proc cache buffers from when doing optimization or execution without having to grab from the common pool – thus avoiding the spinlock. However, if all the available proc cache is used – or nearly so – then there is nothing left for the ELC. As a result, when an engine needs more proc cache, it has to go to the global pool. Now, if memory is tight, it is likely more fragmented as we have probably been doing a lot of proc cache turnover – so attempts to do large (16KB) allocations fail and we retry with 2KB allocations….and since this works, the next thing we know we are happily running with TF753 (which disables large allocations) and periodic dbcc proc_cache(free_unused), and heaven help the poor fool who suggests we simply haven’t sized things correctly.
But you haven’t. The least little bump….like a more complicated query such as those EOM reports….and proc cache is exhausted. What happens then is an engine sees it has nothing in ELC, requests memory from the global pool – finds there is none – and then ASE simply tells every engine to stop what it is doing and flush any free ELC proc buffers back to the global pool so that engine 1 can do its query. That hurts. Mostly because as soon as engine 1 is done, engine 2 is now screaming for proc cache and we get to repeat the debacle. And the proc cache spinlock heads into warp speed.
The answer – you guessed it – take your computed proc cache and double it. So, if you think you need 4GB of proc cache – configure for 8GB. Then, if your proc cache spinlock contention is still high, it is likely due to another cause such as frequent turnover (e.g. exec with recompile). But at least you can get off the crutch of dbcc proc_cache(free_unused) and TF753.
It makes me shudder every time I hear some supposed internet guru telling a customer to use them….without addressing the real problem. By all means – use the crutches – but get the leg fixed and then ditch them. Right-size your proc cache.