We are in the midst of a lengthy project of migrating a large number of ASEs to 18.104.22.168 (stopped on 22.214.171.124 since we bumped into a strange bug with misbehaving unions with sort by column number which no one could reproduce/explain).
Migrating one of our ASEs we’ve got an input from the field that a conversion proc runs much slower on the new server than on the old server. Well, with the regular 80/20 approach it sounded like a thing to be expected. Not everything converts smoothly – some DBA activity is still required for the final finishes.
I’ve started to investigate – actually was extremely lucky to have both versions of ASEs and the possibility to rerun the code myself. DBA’s paradise. To be extra precise I decided not to stop on the regular showplan, plancost, time/io statistics but to delve into full query metrics and procedure memory footprint on each ASE. Pretty fast I’ve got stuck: based on all metrics ASE 16 executes the procedure faster than ASE 15. sysquerymetrics for all statements has better metrics for ASE 16. Plans generated by optimizer are either completely identical or better based on estimated IO. Field execution time – ASE 16 runs twice slower. Hm.
Lovely stuff. In addition I’ve noticed another pretty nasty surprise: tracing MemUsageKB in monCachedProcedures for the proc while it is executing I’ve observed that for ASE 15 the footprint rests somewhere below 1 MB, while for ASE 16 it grows continuously – up and beyond the size on the procedure cache. If you are lucky – the proc execution eventually ends freeing up the memory back into PC pool. If you are not (e.g. if you start “debugging” the proc execution adding time to write out various statistics) you eventually get the notorious 701 error and the proc memory is freed.
I do recollect from memory that sometime the same optimizer generates plans weighed differently for the same proc. I faced this unexplained situation in ASE 15.7: force optimizer to regenerate the plan (sp_recompile….) and the thing passes. But I did not recollect having a proc MemUsageKB growing constantly while the proc is active (and indeed even on the same ASE 16 server the same proc sometimes behaves normally – generates a plan, uses is with the MemUsageKB staying fixed on around the same value as in ASE 15).
The proc has a cursor that moves 150K rows from one database to another for 3 large tables. Nothing too complicated. Tested with insensitive / semi-sensitive cursors – same behaviour. Tested with cursor declared on select directly with 100 fields and cursor on temp table with 1 field – same thing. I’m really puzzled where my PC memory is getting wasted.
Tried to dig into MDAs: It’s possible, in theory, to see how much memory proc is using through monSysStatement: grab the proc PlanID, trace the proc statistics per-line, per-statement. You should be able to see which proc lines are heavy – on cpu/io/memory level (btw, there is a KBA & CR to fix wrong LIO reporting for insensitive cursors for 126.96.36.199 et al – but this is probably irrelevant to my case). For my cursor I may see that each loop consumes around 600KB memory. At the same time, monCachedProcedures report an increase of around 60K after each loop (and throughout the loop each statement has various memory footprint by itself). Can it be that instead of releasing memory it keeps a portion of it for as long as the plan is active?
Based on monProcedureCacheMemoryUsage the Allocation which constantly grows alongside monCacheProcedures/monProcedureCache is MEMC_SCOMPILE_1 (has constant increase in Active and occasional Reuses). Same with monProcedureCacheModuleUsage – ProceduralObjects.Active keeps growing, with occasional NumPagesReuse. Active pages are approximately the same as the MemUsageKB reported by monCachedProcedures.
Anyone has any idea what may be causing this leak? Anyone experienced the same behaviour?
I’m quite concerned about this. May jeopardise the upgrade project altogether.
Insights will be appreciated.