ASE16 – sp_add_resource_limit…lock_count – Review
Just the last week we had a scenario where in we ran out of locks on one of our core servers. The worst part was it happened in a span of 16 seconds. We have configured for 2 mil locks. We have tons of monitoring in place and none of the monitors could pickup of what happened. We have a suspect but no clear reason of why when set with “set rowcount 2000” this specific update would grab 2 mil locks. Anyway, my intent is not to bore you with my operational issues. I am sure all old hands would have faced various problems with server running out of locks. We had a feature request years ago to have a resource limit for number of locks. Our prayers have been answered…at least partially.
The syntax for adding defining resource limits is as follows. The biggest problem that I see is at least username or appname have to be specified while defining this resource limit. If there are 10000 users it is going to be that many times we have to define and at the same time if there are 2000 applications then it is that many entries. Both of them are NOT something we can manage. It would have been ideal if we were given the max number of locks per SPID configuration param. As designed right now we cannot protect our servers from rouge sql statements blowing the configured number of locks.
sp_add_resource_limit <username>,<appname>, “at all times”, “lock_count”, 10000
Managing resource limits for 10,000 users isn't much harder than doing it for 10 if you script the process. The following query will produce an executable script that sets a limit on all users except SA.
select
"execute sp_add_resource_limit "
+ name
+ ", NULL, 'at all times', 'lock_count', 10000 "
from
master..syslogins
where name != "sa"
go
Of course you can.
Just have a monitor/watchdog process run as sa and poll every minute.
Of if you have some monitor/watchdog polling process add the logic.
If you are familiar with MDA (monXXX) tables then monProcessActivity is your friend.
Column of interest is LocksHeld. This shows locks held by a SPID.
If this number crosses a specified limit just record what what you need to about that SPID and terminate it. We have 1M as limit per SPID. Our server configuration at twice that number.
You should also write proper messages to your server error log about this and notify DBAs and affected end user about it.
HTH
Avinash
Avinash,
Thanks for your suggestion. We do collect data for monProcessActivity and have other controls also (locks from syslocks by spid). However there is not much you can do if all this is happening in the space of 16sec as was the case with our server last week.
Bret,
Our system accounts do bulk of the work and we can add controls easily for the system accounts. However, it is non-system accounts which are most often are responsible for rouge queries. In a fairly dynamic environment trying to maintain the resource limits by loginname is a challenge. So is the reason my recommendation is to for a sp_configure setting like 'pct of locks per spid',50 (of the locks configured) or even better if the current sp is enhanced to accept null values for both username and appname.
Thanks!
Prasad
Prasad
We were in similar situation.
16 seconds or 1 minute may be a long time on a powerful server.
But I found this approach worked for us.
Unlike syslocks MDA table does not need to be materialized every time so it does not need tempdb space at all. Sooo you can poll more frequently and calibrate. Also what I noticed was insert into temp tables from a huge active table with DRL lock scheme added up the locks very quickly. In your case what is SQL that is causing lock grab ?
Avinash
A couple of points
1 - ASE 16 adds monThresholdEvent - check it out
2 - MDA and syslocks both poll the full list of locks - so the behind the scenes work is the same. In some cases they function a bit different - e.g. if using MDA via a query in SQL, a work table may still be necessary
3 - set rowcount only affects the number of impacted rows - the number of scanned rows can be considerably higher. For example, a common case of this is where someone is lazy and uses 'ignore dupe rows' and does an insert select .......you actually may scan 100000 rows and have 50000 rows inserted with set rowcount 1000....you will also have 49000 CLRs to undo 49000 dupes of the 50000 rows inserted. How many locks did you need??? Depends....iso3 on the select???
Jeff
Would a simple query like "select (set of columns) from For monProcessActivity where LocksHeld > 1000000' still need work table ? I think not.
I can understand that order by or further join(s) may need work tables but to zero down on a SPID with a simple where filter clause will not.
We are in process of upgrading to ASE 16 so I look forward to using new set of MDA tables eventually.
Avinash
lock_count is not documented. Are there other undocumented limits available?
monThresholdEvent works only on resource limits? Log threshold is not displayed there I guess? monThresholdEvent sounds good as it is supposed to display SQLText and the LineNumber in batch/proc for the query crossing the limit...
Look in the NFG for lock count. ....and yes, only resource govenor limits appear here - sorry.
Ok. It is in SP02... Thanks...