Managing HANA systems is not simple due to many reasons. It is mainly due to the evolving complexity of HANA architecture (single or multi nodes, MDC or not, row or columns stores, master/slave services, memory management, xsa services, etc.).
Depending on the use cases, it is mandatory to gain the required knowledge or training on the new aspects of HANA. For example, your company decides to develop custom web apps on Hana (using XSA), you need to figure out how to implement and integrate it in your landscape (with security, scalability, backups,… concerns). Another example could be the integration of an application hosted in SCP.
To help customers better manage their systems, SAP provide tons of resources and information, whether there are available in more than 300+ pages guides, in FAQ type OSS notes, in training material or online tutorial for the basic part. But most of the time, among all this information and the difficulty to access the right info at the needed time, we find ourselves searching for answers on Google…
In an effort to provide some help to the SAP IT teams, here under is my work in progress of tips, links, or SQL queries gathered from SAP sources or when working on my SAP HANA startup, that can help or at least should be aware of.
I will enrich this knowledge base with my experiences and your feedbacks (feel free to comment).
Few SAP references to start:
Tips & Queries:
During the uptime of HANA systems, the Row Store is used for various allocation requests. With unloads from the memory, fragmentation happens. Use this link to estimate how much you can get with an online or offline execution. Refer to OSS note 1813245 — SAP HANA DB: Row store reorganization.
SELECT HOSTNAME, PORT, SERVICE_NAME, GB_RS_ALLOC_SIZE, GB_RS_FREE_SIZE, GB_RS_MAX_SAVING_MEM, GB_RS_MAX_SAVING_LOG_VOLUME from ( SELECT M.HOST HOSTNAME, M.PORT, S.SERVICE_NAME , CASE WHEN ( ( (SUM(M.FREE_SIZE) / SUM(M.ALLOCATED_SIZE)) > 0.30 ) AND SUM(M.ALLOCATED_SIZE) > TO_DECIMAL(10) * 1024 * 1024 * 1024 ) THEN ‘TRUE’ ELSE ‘FALSE’ END filter, TO_DECIMAL(SUM(M.ALLOCATED_SIZE) / 1048576 / 1024, 10, 2) GB_RS_ALLOC_SIZE, TO_DECIMAL(SUM(M.FREE_SIZE) / 1048576, 10, 2) GB_RS_FREE_SIZE, TO_DECIMAL(R.FREE_SIZE / 1024768 / 1024, 10, 2) GB_RS_MAX_SAVING_MEM, TO_DECIMAL(R.FREE_SIZE / 1024768 / 1024 * 1.5, 10, 2) GB_RS_MAX_SAVING_LOG_VOLUME FROM M_RS_MEMORY M INNER JOIN M_SERVICES S ON M.PORT = S.PORT INNER JOIN M_RS_MEMORY R ON M.PORT = R.PORT WHERE ( ( M.CATEGORY = ‘TABLE’ OR M.CATEGORY = ‘CATALOG’ ) AND R.CATEGORY = ‘TABLE’ ) GROUP BY M.HOST, M.PORT, S.SERVICE_NAME, R.FREE_SIZE ) where filter = ‘TRUE’
SELECT TOP 20 “SCHEMA_NAME”, sum(“DISTINCT_COUNT”) RECORD_COUNT, sum(“MEMORY_SIZE_IN_TOTAL”) “COMPRESSED_SIZE”, sum(“UNCOMPRESSED_SIZE”) UNCOMPRESSED_SIZE, (sum(“UNCOMPRESSED_SIZE”)/sum(“MEMORY_SIZE_IN_TOTAL”)) as COMPRESSION_RATIO FROM “SYS”.”M_CS_ALL_COLUMNS” WHERE MEMORY_SIZE_IN_TOTAL > 0 GROUP BY “SCHEMA_NAME” having sum(“DISTINCT_COUNT”)>0 ORDER BY COMPRESSION_RATIO DESC
select TOP 20 “TABLE_NAME”, sum(“DISTINCT_COUNT”) RECORD_COUNT, sum(“MEMORY_SIZE_IN_TOTAL”) “COMPRESSED_SIZE”, sum(“UNCOMPRESSED_SIZE”) UNCOMPRESSED_SIZE, (sum(“UNCOMPRESSED_SIZE”)/sum(“MEMORY_SIZE_IN_TOTAL”)) as COMPRESSION_RATIO FROM “SYS”.”M_CS_ALL_COLUMNS” WHERE MEMORY_SIZE_IN_TOTAL > 0 GROUP BY “TABLE_NAME” ORDER BY COMPRESSION_RATIO DESC
select S.service_name || ‘-’ || S.PORT as “SERVICE”, COUNT(IFNULL(C.CONNECTION_STATUS, ‘IDLE’)) as “NB_ACTIVE_CONNECTIONS” FROM SYS.M_CONNECTIONS C inner join SYS.M_SERVICES S on C.PORT = S.PORT WHERE C.LOGICAL_CONNECTION_ID = C.CONNECTION_ID AND C.OWN = ‘FALSE’ group by S.service_name || ‘-’ || S.PORT
select ‘LAST_5MIN’ as “C1”, count(*) AS “NB_MERGES”, sum(merged_delta_records) AS “NB_MERGED_REC”, AVG(EXECUTION_TIME) AS “SEC_AVG_MERGE”, max(EXECUTION_TIME) AS “SEC_MAX_MERGE” FROM _SYS_STATISTICS.HOST_DELTA_MERGE_STATISTICS WHERESTART_TIME >= ADD_SECONDS (CURRENT_TIMESTAMP, -300)
SELECT count(*), AVG(SECONDS_BETWEEN(ACQUIRED_TIME, CURRENT_TIMESTAMP)) “AVG-OBJ-LOCK-AGE”, MAX(SECONDS_BETWEEN(ACQUIRED_TIME, CURRENT_TIMESTAMP)) “MAX-OBJ-LOCK-AGE” FROM M_OBJECT_LOCKS
SELECT count(*), AVG(SECONDS_BETWEEN(ACQUIRED_TIME, CURRENT_TIMESTAMP)) “AVG-REC-LOCK-AGE”, MAX(SECONDS_BETWEEN(ACQUIRED_TIME, CURRENT_TIMESTAMP)) “MAX-REC-LOCK-AGE” FROM M_RECORD_LOCKS
SELECT HOST, ROUND(USED_PHYSICAL_MEMORY/1024/1024/1024, 2) AS “Resident GB”, ROUND((USED_PHYSICAL_MEMORY + FREE_PHYSICAL_MEMORY)/1024/1024/1024, 2) AS “Physical Memory GB” FROM PUBLIC.M_HOST_RESOURCE_UTILIZATION
SELECT SCHEMA_NAME AS “Schema”, ROUND(SUM(MEMORY_SIZE_IN_TOTAL) /1024/1024) AS “MB Used” FROM M_CS_TABLES GROUP BY SCHEMA_NAME ORDER BY “MB Used” DESC
SELECT USAGE_TYPE as “DATA TYPE”, LPAD(TO_DECIMAL(USED_SIZE / 1024 / 1024 / 1024, 10, 2), 12) AS “GB-DISK-USED”, LPAD(TO_DECIMAL(MAP(TOTAL_SIZE, 0, 0, USED_SIZE / TOTAL_SIZE * 100), 10, 2), 😎 AS “PCT-DISK-USED”, LPAD(TO_DECIMAL(TOTAL_SIZE/1024/1024/1024, 10, 2), 12) AS “GB-DISK-TOTAL”FROM M_DISKS
KPI that shows potentials misconfiguration:
Counters for buffer switches indicate performance of the in-memory log buffers. Normally, buffer switching happens without any waits. In the case of buffer full, however, a wait is necessary. If the wait ratio is higher than one percent 0.01, this indicates a possible misconfiguration of the system.
SELECT B.HOST as HOSTNAME, S.SERVICE_NAME ||’-’||S.PORT AS “SERVICE”, SWITCH_WAIT_COUNT as service_log_buffer_wait_count__nb, SWITCH_NOWAIT_COUNT as service_log_buffer_no_wait_count__nb, SWITCH_WAIT_COUNT/SWITCH_NOWAIT_COUNT as ratio FROM M_LOG_BUFFERS B INNER JOIN M_SERVICES S ON B.PORT=S.PORT
HANA uses asynchronous IO to write to and read from disk. Triggering an asynchronous IO should return immediately. When the request is finished, a callback is called. The time needed for triggering an IO request is called TRIGGER_TIME. The time for completely finishing an IO request is called IO_TIME. The ratio TRIGGER_TIME / IO_TIME is called TRIGGER_RATIO.
Ideally, asynchronous IO should have a trigger ratio close to zero because merely triggering a request should take much less time than actually executing this request. A trigger ratio close to 1 indicates asynchronous IO that behaves almost like synchronous IO, that is: triggering an IO request takes just as long as executing it.
Some activities on HANA can degrade the R/W trigger ratio as they are by nature very consuming (savepoints, commits, massive loading of tables).
To detect problem on this field, execute this query for results (no results = no problem), check OSS note 1930979 — Alert: Sync/Async for corrective actions.
select HOST, PORT, TYPE, PATH, CONFIGURATION, TRIGGER_WRITE_RATIO, TRIGGER_READ_RATIO from SYS.M_VOLUME_IO_TOTAL_STATISTICS where TRIGGER_WRITE_RATIO >= 0.5 OR TRIGGER_READ_RATIO >= 0.5
select c.host, c.user_name, c.connection_status, c.transaction_id, s.last_executed_time, round(s.allocated_memory_size/1024/1024/1024,2) as “Alloc Mem (GB)”, round(s.used_memory_size/1024/1024/1024,2) as “Used Mem (GB)”, s.statement_string from m_connections c, m_prepared_statements s where s.connection_id = c.connection_id and c.connection_status != ‘IDLE’ order by s.allocated_memory_size desc
Useful views:
Thanks for your time. Do not hesitate to share this and provide yours tips.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
User | Count |
---|---|
8 | |
5 | |
5 | |
4 | |
4 | |
4 | |
3 | |
3 | |
3 | |
2 |