HANA Daily Monitoring Template
This blog is intended to give HANA administrators a sample template they can use to monitor their HANA environment proactively.
You are all invited to add comments if you feel more steps should be included in the template. Its purpose is monitoring, not resolution; each reported issue can be resolved separately. You can also use this template as a pointer to what should be monitored if you monitor your SAP HANA landscape via SAP Solution Manager.
1. First, check that all services are running:
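If you prefer to script this check rather than look at the console, one option is to read the M_SERVICES monitoring view. The sketch below is only an illustration, assuming the hdbcli Python client is available on the machine; the function names and the connection details are placeholders, not part of the original template.

```python
from typing import Iterable, List, Tuple

# Read the status of every service from the M_SERVICES monitoring view.
SERVICES_SQL = "SELECT HOST, PORT, SERVICE_NAME, ACTIVE_STATUS FROM M_SERVICES"

ServiceRow = Tuple[str, int, str, str]


def inactive_services(rows: Iterable[ServiceRow]) -> List[ServiceRow]:
    """Return every service row whose ACTIVE_STATUS is not 'YES'."""
    return [row for row in rows if str(row[3]).upper() != "YES"]


def fetch_service_rows(host: str, port: int, user: str, password: str) -> List[ServiceRow]:
    """Fetch M_SERVICES rows via hdbcli (host, port and credentials are placeholders)."""
    from hdbcli import dbapi  # imported here so the filter above works without the client

    conn = dbapi.connect(address=host, port=port, user=user, password=password)
    try:
        cursor = conn.cursor()
        cursor.execute(SERVICES_SQL)
        return [tuple(r) for r in cursor.fetchall()]
    finally:
        conn.close()
```

An empty result from inactive_services(fetch_service_rows(...)) would mean every service reports as active.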
2. Run the unique checker (you can also schedule it in your crontab to get the results sent to your mailbox automatically).
This program helps you find duplicate entries in tables. Reach out to SAP to get the program, or refer to https://help.sap.com/hana/SAP_HANA_Administration_Guide_en.pdf if you do not have it.
3. Check for crash dumps, if any:
Check in the admin console -> Performance to see the dumps (including OOM dumps); enter "dump" as the search text.
If you find a crash dump -> analyze whether a query caused it -> if so, notify the query owner so the query can be optimized.
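The same "dump" search can also be done on the file system. The sketch below assumes that dumps are written as trace files whose names contain "dump" and that you know the path of the trace directory on your host; the function name and the default keyword are illustrative choices.

```python
import os
from typing import List


def find_dump_files(trace_dir: str, keyword: str = "dump") -> List[str]:
    """Return the sorted names of files in trace_dir whose name contains keyword."""
    matches = []
    for name in os.listdir(trace_dir):
        full_path = os.path.join(trace_dir, name)
        # Only regular files count; the comparison ignores upper/lower case.
        if os.path.isfile(full_path) and keyword.lower() in name.lower():
            matches.append(name)
    return sorted(matches)
```

Called with the trace directory of a node, it returns the file names to look at; an empty list means nothing matched the search text.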
4. Check SLT for any table in error status.
If no errors are shown, all is good.
5. Check LTR as well:
Also check the traditional transaction codes ST22 and SM21; they should not show any critical dumps.
6. Clean up garbage memory:
The frequency could be every day or once every 3 days; you can decide after observing the pattern.
Execute mm gc -f
It triggers the garbage collector and frees memory without unloading tables.
Remark: to execute mm gc -f, log in to the HANA server -> HDBAdmin.sh -> Services -> Console -> select the node -> execute the command.
7. Validate backups: successful backup taken on **/**/**. Next backup on **/**/**.
If a backup failed, analyze why and take action accordingly.
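The date of the last successful backup can also be read from the M_BACKUP_CATALOG view. The sketch below is a hedged example: the 24-hour threshold and the function names are assumptions made for illustration, and the rows are expected to come from running BACKUP_SQL through whatever SQL client you use.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

# Complete data backups from the backup catalog, newest first.
BACKUP_SQL = (
    "SELECT ENTRY_TYPE_NAME, STATE_NAME, SYS_END_TIME "
    "FROM M_BACKUP_CATALOG "
    "WHERE ENTRY_TYPE_NAME = 'complete data backup' "
    "ORDER BY SYS_START_TIME DESC"
)

BackupRow = Tuple[str, str, Optional[datetime]]


def last_successful_backup(rows: List[BackupRow]) -> Optional[datetime]:
    """Return the end time of the newest backup in state 'successful', or None."""
    ends = [end for _type, state, end in rows if state == "successful" and end is not None]
    return max(ends) if ends else None


def backup_is_stale(rows: List[BackupRow], now: datetime,
                    max_age: timedelta = timedelta(hours=24)) -> bool:
    """True if there is no successful backup or the newest one is older than max_age.

    The 24-hour default is an assumption; set it to match your backup schedule.
    """
    last = last_successful_backup(rows)
    return last is None or now - last > max_age
```

A stale result only tells you that a backup is missing or failed; it says nothing about whether the backup can actually be recovered.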
I hope this template helps you keep your HANA environment healthy and running 🙂. Happy monitoring.
Please add any step you feel should be part of the daily monitoring tasks.
Nice blog and thanks for sharing.
Points 2 (uniqueChecker) and 6 (garbage cleanup) are definitely neither required nor recommended to run on a daily basis.
Also, HDBAdmin is not supported for any use outside SAP HANA development.
Concerning the validate backup: full agreement, the backups need to be checked.
But simply checking if they ran without error doesn't cut it.
To actually validate backups a recovery on a separate instance is necessary - otherwise you never know if you could actually perform the recovery.
Thanks for the input.
I agree with you that these two take a heavy toll and a lot of time, so they should not be run every day.
But as said, it depends on your requirements (in my case as well), especially for the unique checker. If you see frequent table corruption, filter on the affected table(s) and run it only for those rather than for all tables.
when you see frequent table corruption, you don't need the uniqueChecker. You need new hardware (or upgrade to a newer revision...)
Thanks for the valuable input. We are in the process of doing the same (as AGS recommended to us).
I will post an update on how it goes after the hardware upgrade.
Lars - to my other point - how can you really observe table corruption?
Corruption in data structures can be experienced in all sorts of ways:
For most cases SAP HANA should be able to figure out a corruption by itself and also "repair" the corrupted data by re-reading the last saved state from disk and re-applying all changes performed since (apply redo log).
I've added hdbbackupcheck to my backup script. It is no substitute for actually restoring on another instance, but it verifies consistency.
1869119 - Checking backups using hdbbackupcheck
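As a rough illustration of such a step, the sketch below runs hdbbackupcheck once per backup file and records which files passed. It assumes that the tool takes the backup file path as its argument and returns exit code 0 on success; please confirm both against the SAP Note above. The function names are hypothetical.

```python
import subprocess
from typing import Callable, Dict, Iterable


def run_hdbbackupcheck(path: str) -> int:
    """Run hdbbackupcheck on one backup file and return its exit code."""
    return subprocess.run(["hdbbackupcheck", path]).returncode


def check_backup_files(paths: Iterable[str],
                       runner: Callable[[str], int] = run_hdbbackupcheck) -> Dict[str, bool]:
    """Map each backup file to True if its check returned exit code 0 (assumed success)."""
    return {path: runner(path) == 0 for path in paths}
```

The runner parameter exists only so the loop can be tried out without the tool installed; in a backup script you would call check_backup_files with the list of files the backup just wrote.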
Can we add this command to the end of the backup script?
Or do we have to run the check for each backup file generated?
Hi Lars/Vinay, I am just curious as to the garbage collector function. I searched all help documents and SCN and can't really find any documents that describe this in more detail. Does this "garbage" affect HANA globally or only specific tables? In what cases is "garbage" created?
The reason I am interested is that I was involved in a scenario last week where the performance on a few select tables degraded horrendously, whereas all other tables in the system were performing optimally. Even a SELECT COUNT(*) on the affected tables was taking upwards of 1 minute on 250 million records, where the same query on a 1.5 billion row table took 250ms. On checking the merge, column optimization, system load and all the "normal" methods to see what may be affecting performance, I came up empty-handed with no explanation. Miraculously, at some later point, the specific tables started performing normally again, with no action or explanation.
I am wondering if garbage collection could be an explanation or if there are other underlying "corruption" indicators to check for on specific tables.
sorry, too few details here to even make an educated guess.
"Garbage" data is all data we don't need any more. This includes old versions of data that once were current as well as temporary data and so forth.
Garbage collection works per allocator and per virtual file. E.g. LOB columns have their own separate memory handling.
All this happens automatically and typically no user interaction is required.
Thanks Lars. I guess the main thing I wanted to know is if the so called garbage (if left piled up the alley), would affect performance globally or if it would be on an object by object basis.
Nice Article. Thanks a lot. I will follow the script as stated. Until I can learn better.
This is good as a base line to start.
Thanks for reading and liking it.
Very Nice info...Thank You....Kindly keep more blogs.
Very nice article. Thanks.
Need your help on one issue,
Following point 2, I have scheduled the unique checker in crontab on the HANA server with the sidadm user, but the script fails with the error below:
Traceback (most recent call last):
File "uniqueChecker.py", line 8, in <module>
from hdbcli import dbapi
File "/HANA/sapmnt/<SID>/exe/linuxx86_64/HDB_1.00.53.375657_1048054/python_support/hdbcli/dbapi.py", line 15, in <module>
ImportError: No module named pyhdbcli.
However, I am able to run Unique checker manually on the server.
thanks Vinay .....very helpful info
There's a useful OSS Note which complements this blog and subject:
1977584 - Technical Consistency Checks for SAP HANA Databases
Thanks Vinay, it is a very nice doc.
Is there a way to schedule the garbage collection?
Garbage collection is triggered after a transaction is committed and also periodically (every hour by default). A transaction that is currently committing can be identified in the Threads tab (see System Performance Analysis). The thread type will be “SqlExecutor” and the thread method “commit”.
The periodic garbage collection can be identified by the thread type “MVCCGarbageCollector”.
Note that the periodic garbage collection interval can be configured in the indexserver.ini file transaction section with the parameter mvcc_aged_checker_timeout.
See SAP Note 2169283 for details about SAP HANA garbage collection including ways to trigger certain types of garbage collections.
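To see which value is currently configured for the interval, one possibility is to look the parameter up in the M_INIFILE_CONTENTS view. The helper below is a small sketch that only builds the query and its parameters; the function name is made up for this example, and the "?" placeholders assume a client such as hdbcli that accepts them.

```python
from typing import Tuple


def inifile_lookup(file_name: str, section: str, key: str) -> Tuple[str, Tuple[str, str, str]]:
    """Build a parameterized query for one setting in M_INIFILE_CONTENTS."""
    sql = (
        "SELECT LAYER_NAME, HOST, VALUE FROM M_INIFILE_CONTENTS "
        "WHERE FILE_NAME = ? AND SECTION = ? AND KEY = ?"
    )
    return sql, (file_name, section, key)


# The parameter named above: indexserver.ini, section transaction.
SQL, PARAMS = inifile_lookup("indexserver.ini", "transaction", "mvcc_aged_checker_timeout")
```

With an open cursor, cursor.execute(SQL, PARAMS) would return one row per configuration layer where the parameter is set.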