HANA Database: Terminate a long-running backup process
Hi,
I have been working on HANA systems for more than a year now. I recently came across a long-running log backup process on one of our production HANA instances that needed a considerable amount of troubleshooting, so I decided to blog about it.
BACKGROUND:
HANA backups differ from traditional database backups. On other RDBMSs, log backups have to be configured explicitly after setting the database to the FULL recovery model (for example, on MSSQL). In HANA, log backups are taken automatically as soon as the system runs with log_mode = normal.
This is convenient for administrators because it needs no additional configuration. The downside is that administrators rarely get practice troubleshooting log backups, so an issue with them is unfamiliar territory when it does occur.
Here is our SCENARIO:
We run HANA SPS09 (revision 97) in an MDC setup with multiple tenant databases, and we use a BACKINT solution to send data and log backups to an external backup appliance.
On one of the tenant databases, an issue overnight caused the log backup to hang, and it went unnoticed until the next morning. When we started troubleshooting, we found multiple alerts in the log for a long-running log backup process.
We could not find any database process on the HANA side that could be terminated. After looking through some SAP Notes, we found troubleshooting steps in SAP Note 2083715, item #7, for identifying the PID and terminating the process on the Linux side:
http://service.sap.com/sap/support/notes/2083715
Following the instructions, we identified the PID from backint.log, which is available in HANA Studio under Diagnosis Files.
The PID was 184492, but no process with that PID was active on the HANA host when we checked from the Linux side.
By then more than 15 hours had passed since the last log backup, and log segment usage kept growing. We considered a planned restart to clear the process, but restarting HANA just to terminate a long-running log backup session is hard to justify.
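For readers who want to double-check that a PID taken from the log really is gone, here is a minimal Python sketch. It assumes it is run directly on the HANA Linux host; the function name pid_is_alive is purely illustrative and not part of any SAP tooling.

```python
import os


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this PID currently exists on the local host."""
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    try:
        # Signal 0 performs only the existence/permission check; nothing is delivered.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

Calling pid_is_alive(184492) on our system would have returned False, which matches what we saw with the usual Linux process listing.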
After some more troubleshooting, we realized that the pipe number shown in the log is exactly the pipe that the BACKINT process creates when it launches a backup. With this clue we located the path on Linux where the pipes are created and found a hung pipe that had never been removed.
(If you list the files under the BACKINT path every few minutes, you will see a pipe appear briefly whenever a new log backup is launched and disappear once the backup completes.)
We first renamed this *log_backup____* file to check whether it could be changed, and then issued an *rm* command to remove it. The long-running backup process failed immediately. This resolved the issue, and we did not have to restart the HANA database.
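The hunt for the leftover pipe described above can be sketched in Python roughly as follows. This is only an illustration under a few assumptions: the directory is whatever path your BACKINT setup uses for its pipes (it is a parameter here, not a known default), "stale" is judged by the file's modification time, and the name find_stale_pipes is made up for this post. The sketch only lists candidates; we did the rename and rm by hand.

```python
import os
import stat
import time


def find_stale_pipes(directory: str, min_age_seconds: float) -> list[str]:
    """Return paths of named pipes (FIFOs) in `directory` at least `min_age_seconds` old."""
    now = time.time()
    stale = []
    for entry in os.scandir(directory):
        info = entry.stat(follow_symlinks=False)
        # Healthy backup pipes vanish within minutes; an old FIFO is a suspect.
        if stat.S_ISFIFO(info.st_mode) and now - info.st_mtime >= min_age_seconds:
            stale.append(entry.path)
    return sorted(stale)
```

For example, find_stale_pipes(pipe_dir, 3600) would list every pipe older than one hour in pipe_dir. Regular files and fresh pipes from backups that are currently running are ignored.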
Note: Terminating a HANA data backup is straightforward: identify the backup ID and run the statement BACKUP CANCEL <BACKUP ID>.
See the link below for more details:
http://help.sap.com/saphelp_hanaplatform/helpdata/en/c4/f934abbb571014b5fec3c1121b4dad/content.htm
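As a small illustration of the data backup case, the sketch below builds the two statements involved. It is a hedged example, not an official procedure: the helper name build_backup_cancel is made up, the lookup query assumes the M_BACKUP_CATALOG monitoring view with its BACKUP_ID, ENTRY_TYPE_NAME and STATE_NAME columns (please verify on your revision), and the strings still have to be run through whatever SQL client you normally use.

```python
# Assumed lookup for backups that are still running; verify the view on your revision.
RUNNING_BACKUPS_SQL = (
    "SELECT BACKUP_ID, ENTRY_TYPE_NAME FROM M_BACKUP_CATALOG "
    "WHERE STATE_NAME = 'running'"
)


def build_backup_cancel(backup_id) -> str:
    """Build the BACKUP CANCEL statement for a numeric backup ID."""
    text = str(backup_id).strip()
    # Accept plain ASCII digits only, so nothing else can end up in the statement.
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"backup ID must be a non-negative integer, got {backup_id!r}")
    return f"BACKUP CANCEL {int(text)}"
```

For a backup ID of 1457354400000 (an invented value), build_backup_cancel returns the string BACKUP CANCEL 1457354400000.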
Hope this helps.
Sunil
very informative!
Hello Sunil
Thank you very much for your blog. I have solved a similar problem with your instructions.
Regards. Angel
Good to know the post is useful
Sunil
Killing the prole process can also help when TSM backint is in use.