This blog is intended to give HANA administrators a sample template they can use to monitor their HANA environment proactively.


You are all invited to add comments if you feel more steps should be included in the template. Its intention is monitoring, not resolution; each issue reported can be resolved separately. You can also use this template as a pointer for what should be monitored if you are managing your SAP HANA landscape via SAP Solution Manager.



1. First, check whether all the services are running fine:


/wp-content/uploads/2014/04/pic1_426110.png
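Besides the admin console, the same check can be done at OS level with sapcontrol (part of the SAP kernel). This is only a sketch: the instance number is an assumption, and the SAPCONTROL variable exists here purely so the binary location can be overridden.

```shell
# check_services: list the HANA process states via sapcontrol.
# The instance number (default 00) is an assumption - adjust it for
# your landscape. SAPCONTROL can be overridden to point at the binary.
check_services() {
    nr="${1:-00}"
    "${SAPCONTROL:-sapcontrol}" -nr "$nr" -function GetProcessList
}

# Typical use on the HANA host:
# check_services 00   # every service should report GREEN / "Running"
```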

2. Run the unique checker (you can also schedule it in your crontab, so that you receive updates automatically in your mailbox).

  This program helps you find duplicate entries in tables. Reach out to SAP to get the program, or refer to https://help.sap.com/hana/SAP_HANA_Administration_Guide_en.pdf if you do not have it.
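For the crontab scheduling mentioned above, an entry along these lines works; the schedule, the paths and the mail address below are placeholders, not the real locations on your system:

```
# Run the unique checker daily at 06:00; cron mails stdout to MAILTO.
MAILTO=hana-admin@example.com
0 6 * * * /path/to/hdb/python /path/to/uniqueChecker.py
```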

3. Check for crash dumps, if any:

Check the admin console -> Performance to see the dumps (OOM dumps as well); give the search text as “dump”.

If you find any crash dump, analyze whether it was caused by a query, and if so notify the query owner to optimize it.
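The same search can be done at OS level against the trace directory. A minimal sketch, assuming crash and OOM dump file names contain "crashdump" or "oom" (the usual naming, but verify on your revision); the default path is a placeholder:

```shell
# find_dumps: list crash/OOM dump files in a trace directory, newest first.
# The default path is a placeholder - adjust SID, instance number and host.
find_dumps() {
    dir="${1:-/usr/sap/<SID>/HDB<nr>/$(hostname)/trace}"
    find "$dir" -maxdepth 1 -type f \
        \( -name '*crashdump*' -o -name '*oom*' \) \
        -printf '%T@ %p\n' 2>/dev/null | sort -rn | cut -d' ' -f2-
}
```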


4. Check SLT – whether any table has an error status.



/wp-content/uploads/2014/04/pic2_426112.png


No errors, so all is good.

5. Check the LTR as well:

/wp-content/uploads/2014/04/pic3_426113.png

Also check the traditional transaction codes ST22 and SM21; they should not contain any critical dumps.




6. Clean up the garbage memory:

The frequency could be every day or once in three days; you can decide after observing the pattern:

Execute mm gc -f


This triggers the garbage collector, which frees up memory without unloading the tables.

Remark – to execute mm gc -f you need to log in to the HANA server -> HDBAdmin.sh -> Services -> Console -> select the node -> execute the command.
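As noted in the comments below, HDBAdmin is not supported outside SAP HANA development. The same console command can in principle be issued through hdbcons; this is a hedged sketch only - hdbcons is likewise an SAP-internal tool, and the availability of 'mm gc -f' as an hdbcons command is an assumption to verify on your revision and with SAP support:

```shell
# trigger_gc: run the memory-manager garbage collection via hdbcons.
# hdbcons is an internal tool; use it only when advised by SAP support.
trigger_gc() {
    if command -v hdbcons >/dev/null 2>&1; then
        hdbcons 'mm gc -f'
    else
        echo 'hdbcons not found in PATH' >&2
        return 127
    fi
}
```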

7. Validate the backup – successful backup taken on **/**/**. Next backup on **/**/**.

If the backup failed, analyze the cause and take action accordingly.
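Beyond checking the backup status, each backup file can be run through hdbbackupcheck (see SAP Note 1869119) to verify its consistency. A minimal wrapper sketch; the checker location and the file list in the usage note are placeholders:

```shell
# check_backups: run a consistency checker over each backup file and
# report per-file results; returns non-zero if any file fails.
# $1 is the path to hdbbackupcheck (SAP Note 1869119); the rest are files.
check_backups() {
    checker="$1"; shift
    rc=0
    for f in "$@"; do
        if "$checker" "$f" >/dev/null 2>&1; then
            echo "OK   $f"
        else
            echo "FAIL $f"
            rc=1
        fi
    done
    return $rc
}

# Typical use (paths are placeholders):
# check_backups /path/to/hdbbackupcheck /backup/data/COMPLETE_DATA_BACKUP*
```

Note that, as pointed out in the comments, even this is no substitute for an actual test recovery on a separate instance.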

Hope this template helps you keep your HANA environment healthy and running 🙂 . Happy monitoring!

Please add any step you feel should be part of the daily monitoring tasks.


24 Comments


  1. Lars Breddemann

    Hi there!

    Nice blog and thanks for sharing.

    Few comments:

    Point 2 (uniquechecker) and 6 (garbage cleanup) are definitively not required nor recommended to run on a daily basis.

    Also, HDBAdmin is not supported for any use outside SAP HANA development.

    Concerning the validate backup: full agreement, the backups need to be checked.

    But simply checking if they ran without error doesn’t cut it.

    To actually validate backups a recovery on a separate instance is necessary – otherwise you never know if you could actually perform the recovery.

    – Lars

    1. singh vinay Post author

Hi Lars,

      thanks for the input.

I agree with you – these two take a lot of time and toll on the system, so they should not be run every day.

But as said (in my case also), it depends on your requirement, especially for the unique checker. If you see frequent table corruption, put a filter on that particular table or tables and run it for those only, rather than running it for all tables.

      regards,

      vinaysingh

      1. Lars Breddemann

        Singh,

        when you see frequent table corruption, you don’t need the uniqueChecker. You need new hardware (or upgrade to a newer revision…)

        – Lars

        1. singh vinay Post author

          Hi Lars,

Thanks for the valuable input; we are in the process of doing the same (as recommended to us by AGS).

Will update how it goes after the hardware upgrade.

          regards,

          vinay singh

          1. Lars Breddemann

            Hi Justin

            Corruption in data structures can be experienced in all sorts of ways:

            • wrong data, no error message
            • wrong data and error messages
            • error messages
            • system crashes

            For most cases SAP HANA should be able to figure out a corruption by itself and also “repair” the corrupted data by re-reading the last saved state from disk and re-applying all changes performed since (apply redo log).

            – Lars

    2. Jake Echanove

I’ve added hdbbackupcheck to my backup script. It’s no substitute for actually restoring on another instance, but it verifies consistency.

      1869119 – Checking backups using hdbbackupcheck

      1. Rabindra Das

        Hi Jake

Can we include this command at the end of the backup script?

Or do we have to check each backup file generated?

        Thanks

        Rabi

    3. Justin Molenaur

      Hi Lars/Vinay, I am just curious as to the garbage collector function. I searched all help documents and SCN and can’t really find any documents that describe this in more detail. Does this “garbage” affect HANA globally or only specific tables? In what cases is “garbage” created?

      Reason I am interested is that I was involved in a scenario last week where the performance of a few select tables degraded to a horrendous degree, whereas all other tables in the system were performing optimally. Even a SELECT COUNT(*) on the affected tables was taking upwards of 1 minute on 250 million records, while the same query on a 1.5 billion row table took 250 ms. On checking the merge, column optimization, system load and all the “normal” methods to see what might be affecting performance, I came up empty-handed with no explanation. Miraculously, at some later point, the specific tables started performing normally again – no action or explanation.

      I am wondering if garbage collection could be an explanation or if there are other underlying “corruption” indicators to check for on specific tables.

      Regards,

      Justin

      1. Lars Breddemann

        Hi Justin

        sorry, too few details here to even make an educated guess.

        “Garbage” data is all data we don’t need any more. This includes old versions of data that once were current as well as temporary data and so forth.

        The garbage collection works allocator and virtual file wise. E.g. LOB columns have their own separate memory handling.

        All this happens automatically and typically no user interaction is required.

        – Lars

        1. Justin Molenaur

          Thanks Lars. I guess the main thing I wanted to know is whether the so-called garbage (if left piled up in the alley) would affect performance globally or on an object-by-object basis.

          Regards,

          Justin

  2. Randy Middleton

    Nice article. Thanks a lot. I will follow the script as stated until I can learn better.

    This is good as a base line to start.

    Thanks again.

  3. Kartik Kumar

    Hello Vinay,

    Very nice article. Thanks.

    I need your help on one issue.

    Following point 2, I have scheduled the unique checker in crontab on the HANA server with the sidadm user, but the script fails with the error below:

    Traceback (most recent call last):

      File “uniqueChecker.py”, line 8, in <module>

        from hdbcli import dbapi

      File “/HANA/sapmnt/<SID>/exe/linuxx86_64/HDB_1.00.53.375657_1048054/python_support/hdbcli/dbapi.py”, line 15, in <module>

        import pyhdbcli

    ImportError: No module named pyhdbcli.

    However, I am able to run the unique checker manually on the server.
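    (A likely cause, for anyone hitting the same error: cron starts jobs with a minimal environment, so the compiled pyhdbcli module next to hdbcli is not on the import path, even though it is when you log in interactively. One hedged fix is to call a wrapper script from cron that sets up the instance environment first; the sketch below assumes the standard hdbenv.sh location, and all paths are placeholders.)

    ```
    #!/bin/sh
    # Wrapper to run uniqueChecker.py from cron. Sourcing the instance
    # environment exports PYTHONPATH/LD_LIBRARY_PATH so that pyhdbcli
    # can be imported. Paths are placeholders - adjust SID and instance.
    . /usr/sap/<SID>/HDB<nr>/hdbenv.sh
    python /path/to/uniqueChecker.py "$@"
    ```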

    1. KRISHNA YALAMANCHI

      http://help.sap.com/hana/sap_hana_troubleshooting_and_performance_analysis_guide_en.pdf



      Garbage collection is triggered after a transaction is committed and also periodically (every hour by default). A transaction that is currently committing can be identified in the Threads tab (see System Performance Analysis). The thread type will be “SqlExecutor” and the thread method “commit”.

      The periodic garbage collection can be identified by the thread type “MVCCGarbageCollector”.

      Note that the periodic garbage collection interval can be configured in the indexserver.ini file, [transaction] section, with the parameter mvcc_aged_checker_timeout.
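      For reference, the parameter quoted above would be set like this; the value is an example only, so check the unit and default for your revision before changing it:

      ```
      # indexserver.ini - [transaction] section (value is an example)
      [transaction]
      mvcc_aged_checker_timeout = 3600
      ```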

