Proximity Placement Groups in Azure : Testing the Impact on Query Performance in HANA
Network Latency in SAP Environments:
Network latency is an important metric to measure because it directly affects the runtime of jobs and of transactions run by users. Many SAP applications (custom and standard) run queries as part of their application logic, and depending on the program design these can require multiple round trips to the database. An example is a SELECT … UP TO 1 ROWS inside a loop.
A slight increase in latency is multiplied by the number of times the loop runs and fetches records from the database. During a migration of SAP to an IaaS or on-premise provider, such a change is difficult to identify at the application level if it is very small per record; however, it can multiply many fold if the logic runs in a loop.
If the change is large, it is picked up by many Solman metrics, such as an increase in DB response time, so it can be identified easily.
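To make the multiplication effect concrete, here is a minimal Python sketch of the arithmetic. The function name `added_runtime_seconds` is hypothetical, and the sample values are assumptions chosen to be in the range reported in the case study that follows (around a million executions, 0.5 ms rising to 1 ms per record). It is a rough estimate, not a model of the actual job.

```python
def added_runtime_seconds(executions: int, old_ms: float, new_ms: float) -> float:
    """Extra wall-clock time caused by a per-call latency increase.

    executions: how many times the query runs (e.g. loop iterations)
    old_ms/new_ms: time per database round trip before and after, in ms
    """
    return executions * (new_ms - old_ms) / 1000.0


# Illustrative values only (assumed, rounded from the case study below).
executions = 1_000_000
extra = added_runtime_seconds(executions, old_ms=0.5, new_ms=1.0)
print(f"Added runtime: {extra:.0f} s")  # prints "Added runtime: 500 s"
```

A change of half a millisecond per record is invisible on a single call, yet over a million iterations it adds several minutes to the job.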
Case Study :
Below is a case where we noticed an increase in the runtime of DTP loads in a BW on HANA system that was moved to Azure from an on-premise hosting provider. The time to read a record from the database increased from 0.5 ms to 1 ms, which is difficult to identify with network-level monitoring tools.
The approach and tools below can be used to identify similar performance regressions caused by network latency after a migration to any cloud provider.
** All the screenshots below are taken by me in my SAP environment.
- The runtime of jobs with more application logic doubled.
- The runtime of jobs where most of the logic runs in the DB remained almost the same.
What is Tested :
1. Job runtime – this had roughly doubled.
Runtime of the job on the old server, which is still on-premise, is around 1097 seconds.
Runtime of the job in Azure is around 2490 seconds.
2. ABAP Trace – this showed no changes as there’s no change in application logic.
The ABAP Trace shows that the job runs a SELECT statement inside a loop, which executed around a million times. Although the trace showed a higher Net % on SELECT queries, nothing changed on the ABAP side, as it was just a lift-and-shift migration.
Below is the ABAP Trace on the on-premise server.
Below is the ABAP Trace in Azure.
3. SQL Trace – this showed an increase in Time per execution and Average Time per Record.
SQL Trace on the on-premise server, with the above KPIs highlighted.
SQL Trace in Azure, with the same KPIs.
4. STAD – this showed an increase in average time per record in ms.
STAD shows that the average time per record is around 0.4 ms on the on-premise system.
STAD shows that the average time per record is around 1 ms in Azure.
5. HANA Trace – the execution time of the query is very quick when it is run directly on the database, both on-premise and in Azure.
# begin PreparedStatement_execute (thread 52737, con-id 253012) at 2020-09-18 08:59:50.300595
# con info [con-id 253012, tx-id 88, cl-pid 6860, cl-ip 10.60.49.9, user: SAPABAP1, schema: SAPABAP1]
cursor_139604749899776_c53012.execute(''' SELECT * FROM "/BIC/ASHRD0100" WHERE "/BIC/ZOBJID2" = ? AND "/BIC/ZSCLAS" = N'P' AND "/BIC/ZENDDA" >= ? ORDER BY "/BIC/ASHRD0100" . "REQUEST" , "/BIC/ASHRD0100" . "DATAPAKID" , "/BIC/ASHRD0100" . "RECORD" LIMIT 1 ''', (u'''50000442''', u'''20170308'''))
# end PreparedStatement_execute (thread 52737, con-id 253012) at 2020-09-18 08:59:50.300973
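The begin and end timestamps in the trace above give the execution time on the database side. As a small sketch, the difference can be computed in Python. The helper name `elapsed_microseconds` is hypothetical, and the timestamp format string is an assumption based on the two lines shown in this excerpt.

```python
from datetime import datetime, timedelta

# Assumed format of the timestamps in the trace excerpt above.
TRACE_TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def elapsed_microseconds(begin: str, end: str) -> int:
    """Return the time between two trace timestamps in microseconds."""
    delta = datetime.strptime(end, TRACE_TIME_FORMAT) - datetime.strptime(
        begin, TRACE_TIME_FORMAT
    )
    # Integer division of two timedeltas yields a whole number of microseconds.
    return delta // timedelta(microseconds=1)


begin = "2020-09-18 08:59:50.300595"  # from "# begin PreparedStatement_execute"
end = "2020-09-18 08:59:50.300973"    # from "# end PreparedStatement_execute"
print(elapsed_microseconds(begin, end), "microseconds")  # prints "378 microseconds"
```

At roughly 0.38 ms inside the database, the statement itself is fast, which is consistent with the extra time per record being spent between the application server and the database rather than in the query.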
6. HANA PlanViz – the execution plan of the query is the same on-premise and in Azure.
Below is the execution plan and runtime on the on-premise server.
Below is the execution plan and runtime in Azure.
As seen above, the performance tests showed no regression in any KPI other than an increase in the time to access a record from the database.
This pointed towards latency, so we ran a test with ABAPMETER to verify whether there was latency when reading from the database.
ABAPMETER is part of the ST13 Performance Tools and is a very useful tool for testing performance. However, use it carefully; we recommend that an expert user runs it.
I will write a detailed writeup on ABAPMETER soon.
The ABAPMETER results in Azure without a PPG (Proximity Placement Group) are below.
After this test, our infrastructure team placed both the application server and the database server in one PPG.
After this infrastructure change, the job runtime returned to normal and the average time to read a record fell by 50% (the same as on-premise).
So, what are proximity placement groups? Here is a brief summary from Microsoft.
“An Azure proximity placement group is a logical construct. When one is defined, it’s bound to an Azure region and an Azure resource group.”
Source : https://docs.microsoft.com/en-us/azure/virtual-machines/workloads/sap/sap-proximity-placement-scenarios
In simple words, when we keep the application server and the DB server in one PPG, they stay in the same data center and region, which reduces the latency between them.
There is also an SAP Note on the same topic that can be referred to.
SAP Note : 2931465 – Reduce network latency (RTT) using Proximity placement groups on Microsoft Azure
So, to conclude, when the application server and the database are on separate hosts, it is recommended to use Azure Proximity Placement Groups to reduce the network latency between them.
Very well documented Sri!!
The guidance in the referenced SAP note (2931465) has changed significantly since 2020.
Use of PPGs should be a last resort to enable good performance. Instead, consider using the correct Zonal deployment and alignment model for your VMs and you will see that performance between PPG and non-PPG will not be so significant.
My testing shows that using Zonal deployment vs PPGs gives only around 10ms performance gain with PPGs.
Are there any negatives to using PPGs? Yes there are. If you deallocate a production system, it is possible that a capacity reservation issue in Azure can prevent you from allocating. No more production system until the allocation issue is resolved. Be careful. The same issue exists for non-prod.
YT => https://youtu.be/zy6q3FaQy_4