How often are we faced with this situation?  We are running various types of stress tests on a large environment.  All seems good.  All tests are working as planned.  All of a sudden DB freezes.  Nothing happens.  Bang!  For some mysterious reason our large DB has run out of log space and everything is stuck in log-suspend state.  No worth killing spids or dumping  tran manually.  It will freeze too.

Although some may argue that it never happens to good DBAs (are you nuts?  didn’t you hear about trunc on checkpoint & abort tran options! – some decent monitoring during the tests?) – sometimes these things happen and unpleasantly surprise us when we expect it the least (the last two time I faced this nasty situation were when I discovered how bad triggers impact may be on xOLTP type DMLs even though they do nothing – or how lazy the checkpoint process may be bcp-ing in data in 1k batches into trunc on chkpt set DB with 12 GB log).  Whatever the reason – it is always the nasty feeling that now I will have to rebuilt all this s…t again and waste hours on recovery.

How can this situation be treated, though, in the most efficient way?

The best & fastest option would be – to increase the log device and let ASE take care of itself.

The worst and slowest option would be – to crash the data server and curse ASE unable to start up.

Unfortunately the best option is not always possible (for various reasons) and the worst option… Hey!  Is it really the worst?

Often to recover from the situation like this one goes through the lengthy process of rebuilding the whole DB from scripts (listonly = ‘create_sql’ may sometimes help) and reloading it from dumps to preserve DB layout.  So a DBA kills the server… Starts it in no recovery mode (TFxxx)… Drops database with dbcc dbrepair(xxx)… Rebuilds it… Loads it back…  Day’s work goes down the sink.

There is a much quicker way, however, which is often overlooked.  To use the method above we sometime mark DB manually in sysdatabase as suspect (value 256).  But there is another more benign status which may help to speed up recovery in this situation.  If we know for sure that we will have to reload DB anyway – we may use the value 32 instead.  The status stands for:  Database created with for load option, or crashed while loading database, instructs recovery not to proceed.

That’s precisely what one needs:

dbcc dbreboot(reboot_norecovery, “LOGSUSPENDB”)

If this one works – you’re lucky, may proceed to reload the DB.  If it does not – you’ll have to crash ASE:

sp_configure “allow updates”, 1

update sysdatabases set status = 32 where name = “LOGSUSPENDB”

shutdown with nowait

startserver -f MY_NAUGHTY_ASE –(LOGSUSPENDB will not be recovered on startup)

load database LOGSUSPENDB from …

sp_configure “allow updates”, 0

sp_dboption … –reset the DB options

Unless someone may think of a better option to recover from this nasty situation I’d recommend the one above.

Love it or leave it – I leave it to you.

Have fun.

ATM.

ps.  Of course, good DBAs never get into this situation in the first place, but what is good DBA anyway?

To report this post you need to login first.

Be the first to leave a comment

You must be Logged on to comment or reply to a post.

Leave a Reply