This post covers the issues you are most likely to encounter while using the Fault Manager, and the diagnostic and monitoring tools you can use to fix them. It also gives recommendations for configuring the Fault Manager and the SAP Host Agent. The post includes the following:
— Troubleshooting HADR System/Fault Manager Issues
— Miscellaneous Issues
— Recommendations


Troubleshooting HADR System/Fault Manager Issues

When the Root Partition is Full
On a host running the primary or companion server, the Fault Manager heartbeat log file (dev_hbeat) can grow very large. Once it fills the host’s root partition, the asehostctrl command fails.
Resolution: Use the following command to check the size of the dev_hbeat file to determine if the increased file size is causing the failure:
sudo du -sh /usr/sap/hostctrl/work/dev_hbeat
16G /usr/sap/hostctrl/work/dev_hbeat

To resolve this issue, delete the dev_hbeat file. If the dev_hbeat file does not consume much space, you might want to check other files on the partition.
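If dev_hbeat turns out not to be the culprit, the following sketch lists the largest files under a directory so you can see what else is consuming the partition. The function name large_files and its default threshold are illustrative assumptions, not part of the SAP Host Agent tooling.

```shell
# Sketch: list files under a directory that exceed a size threshold (in MB),
# largest first. large_files is a hypothetical helper name.
large_files() {
    dir=$1
    min_mb=${2:-1024}
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    # -xdev keeps the search on one file system.
    # -size counts 512-byte blocks, so 1 MB = 2048 blocks.
    find "$dir" -xdev -type f -size +"$((min_mb * 2048))" \
        -exec du -k {} + 2>/dev/null | sort -rn
}
```

For example, `large_files /usr/sap/hostctrl/work 1024` prints every file larger than 1 GB in the Host Agent work directory, with its size in kilobytes. Run it as a user that can read the directory.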

When the ASE Cockpit Frequently Displays Timeout Messages
This indicates that the sapdbctrl calls from the Fault Manager are timing out.
Resolution: Increase the timeout period for sapdbctrl by increasing the value for the ha/syb/dbctrl_timeout parameter in the Fault Manager Profile file. The default value of the parameter is 30 seconds. After you have made the necessary changes, restart the Fault Manager using the restart command:
$SYBASE/FaultManager/bin/sybdbfm restart

When Fault Manager Calls to the SAP Host Control Fail
Resolution: Refer to the following logs and search for the errors:
— Fault Manager log (<installation_directory>/log/FaultManager.log)
— SAP Host Agent log (/usr/sap/hostctrl/work/dev_sapdbctrl file)
Generally, start with the Fault Manager log and look for the command that failed. For example, if you suspect that the error is caused by a system heartbeat failure, search the Fault Manager log for TASK = HEARTBEAT_CHECK. Then search the SAP Host Agent log for the text HEARTBEAT_CHECK at the same timestamp. For an accurate diagnosis, ensure that the system clocks of the Fault Manager host and the SAP Host Agent host are in sync. We recommend using trace level 3 (maximum verbose output) while debugging SAP Host Agent issues.
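The search described above can be sketched as a small helper that prints every line mentioning a task from both logs, prefixed with the file name and line number, so you can line the entries up by timestamp. The name find_task is an assumption made for illustration, and the sketch does not parse timestamps because the log format is not described here.

```shell
# Sketch: show each line that mentions a task, in one or more log files.
# find_task is a hypothetical helper name.
find_task() {
    task=$1
    shift
    for log in "$@"; do
        if [ -r "$log" ]; then
            # -F matches the task name literally; -H and -n prefix each hit
            # with the file name and line number. No match is not an error.
            grep -n -H -F -e "$task" "$log" || true
        else
            echo "cannot read $log" >&2
        fi
    done
}
```

A call might look like `find_task HEARTBEAT_CHECK "$FM_DIR/log/FaultManager.log" /usr/sap/hostctrl/work/dev_sapdbctrl`, where FM_DIR is an assumed variable holding the Fault Manager installation directory.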
The SAP Host Agent is a software component that can accomplish many lifecycle management tasks, such as operating system monitoring, database monitoring, system instance control and so on. It contains several sub-modules, including the SAP Host Control. The SAP Host Control runs within the SAP Host Agent under the sapadm user. For more information, refer to the SAP Host Agent architectural overview.

Error While Stopping the Fault Manager
While using the stop command to shut down the Fault Manager, you see this message:
fault manager did not change to mode UNKNOWN within 60 seconds. fault manager running, pid = 15922, fault manager overall status = OK, currently executing in mode DIAGNOSE
Resolution: Re-execute the stop command. Don’t stop the Fault Manager using the kill -9 operating system command.
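If you script the shutdown, re-executing the stop command can be expressed as a bounded retry loop. This is a minimal sketch: the helper name retry, the attempt count, and the delay are assumptions, not documented Fault Manager behavior.

```shell
# Sketch: re-run a command until it succeeds, up to a maximum number of attempts.
# Usage: retry <attempts> <delay_seconds> <command> [args...]
# retry is a hypothetical helper name.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        # Give up once the attempt budget is used.
        [ "$n" -ge "$attempts" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}
```

For instance, `retry 3 10 "$SYBASE/FaultManager/bin/sybdbfm" stop` would try the stop command up to three times, ten seconds apart, when run from the directory that holds the Fault Manager profile file.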

The sybdbfm Utility Displays a “No Fault Manager Found” Message
When using the sybdbfm utility, you may see this message:
no fault manager found for current working directory error: stop failed.
Most likely, you are not running the sybdbfm command from the directory where the profile file and other Fault Manager-generated files (such as sp_sybdbfm and stat_sybdbfm) are located.
Resolution: Re-execute the sybdbfm command from the directory where these files are located.

Replication Status Messages
Although the primary and companion HADR nodes are healthy (db host and db status are both OK), the sanity report still displays the replication status as one of the following:
DEAD
SUSPENDED
UNKNOWN
ASYNC_OK
Resolution: Refer to the Replication Server error logs for information.

Fault Manager Could Not Create a Connection to the Host Agent
The Fault Manager error log indicates (as shown below) that the Fault Manager could not create a connection to the Host Agent.
***LOG Q0I=> NiPConnect2: 10.172.162.61:1128: connect (111: Connection refused)
[/bas/CGK_MAKE/src/base/ni/nixxi.cpp 3324]
*** ERROR => NiPConnect2: SiPeekPendConn failed for hdl 6/sock 6
(SI_ECONN_REFUSE/111; I4; ST; 10.172.162.61:1128) [nixxi.cpp 3324]

Resolution: Check if the sapstartsrv process is running by executing the following command:
ps -aef | grep sapstartsrv
Normally, the sapstartsrv process starts automatically when the SAP Host Agent starts. If sapstartsrv is not running, start it, then restart the SAP Host Agent.
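A plain `ps | grep` can match the grep process itself, so in a script an exact-name check is more reliable. The sketch below assumes pgrep is available and falls back to ps otherwise; is_running is an illustrative name.

```shell
# Sketch: succeed if a process with exactly this name is running.
# is_running is a hypothetical helper name.
is_running() {
    name=$1
    if command -v pgrep >/dev/null 2>&1; then
        # -x requires the process name to match exactly.
        pgrep -x "$name" >/dev/null
    else
        # Fallback: compare against the command-name column of ps.
        ps -e -o comm= | grep -q -x -F -e "$name"
    fi
}
```

You could then write `is_running sapstartsrv || echo "sapstartsrv is not running"`.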

Miscellaneous Issues

  • Ensure that you have write permissions for the SAP ASE installation directory, the Fault Manager installation and execution directories, and the /tmp directory. The Fault Manager creates temporary directories under /tmp, and adds temporary files. In the absence of appropriate permissions, SAP Host Agent calls fail. Also, it’s important to prevent the /tmp directory from becoming full. If /tmp is full, the Fault Manager cannot create temporary files. Check the status of /tmp by executing the df -k /tmp command. If this command shows 100 percent usage, make room in /tmp.
  • Verify that the GLIBC (GNU C Library) version is 2.7 or later. The Fault Manager is built with GLIBC version 2.7, so the hosts running it must use GLIBC version 2.7 or later. Use the following command to check the GLIBC version:
    ldd --version
  • Make sure you enter the correct passwords for sa, DR_admin, and sapadm.
  • Set the appropriate value for file descriptors: A file descriptor is an integer number that uniquely represents an opened file in the operating system. Verify that the user limit value (file descriptor) for open files is set to an adequate number (4096 or more) before you configure the HADR system for large databases.
    To determine the number of file descriptors to which your system is set, enter the following command:

    • For C-shell: limit descriptors
    • For Bourne shell: ulimit -n

    To change the value for the file descriptor (for instance, 4096), enter:

    • For C-shell: limit descriptors 4096
    • For Bourne shell: ulimit -n 4096
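The /tmp, GLIBC, and file descriptor checks from the list above can be sketched as a few small shell functions. All function names here are hypothetical, and glibc_version assumes a glibc-based system where the version is the last word on the first line of `ldd --version`. The version comparison is done field by field, because a plain decimal comparison would rank 2.11 below 2.7.

```shell
# Sketch of the preflight checks above. All function names are hypothetical.

# Print the usage percentage (digits only) of the file system holding a path.
fs_usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed if dotted version $1 is at least version $2 (major.minor).
version_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -t. -k1,1n -k2,2n | head -n 1)
    [ "$lowest" = "$2" ]
}

# Print the glibc version reported by ldd (empty on non-glibc systems).
glibc_version() {
    ldd --version 2>/dev/null | head -n 1 | awk '{ print $NF }'
}

# Succeed if the open-file limit is at least $1 (default 4096).
fd_limit_ok() {
    limit=$(ulimit -n)
    [ "$limit" = "unlimited" ] || [ "$limit" -ge "${1:-4096}" ]
}
```

A check script might combine them as `version_at_least "$(glibc_version)" 2.7 && fd_limit_ok 4096 && [ "$(fs_usage_pct /tmp)" -lt 100 ]`.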

Recommendations

Increase the Trace Level for Troubleshooting
Set the trace level (essentially, the level of detail in the error log) to its highest level on the SAP Host Agent and the Fault Manager so your error log output is as detailed as possible.

    • For the Fault Manager: Set the ha/syb/trace parameter in the profile file (SYBHA.PFL), then restart the Fault Manager (using the $SYBASE/FaultManager/bin/sybdbfm restart command). For example, to get the maximum verbose output, set the trace level to 3 by adding the line ‘ha/syb/trace = 3’ to the SYBHA.PFL file. The SYBHA.PFL file is located in the installation directory of the Fault Manager on all platforms. Increasing the trace level increases the number of log entries and may increase the log file size. You may choose one of the following values for the ha/syb/trace parameter:
      • 1 – Basic verbose output
      • 2 – Medium verbose output
      • 3 – Maximum verbose output
    • For the SAP Host Agent: Set the trace level in the profile file, and restart the SAP Host Agent using the saphostexec program. For example, to get the maximum verbose output, add the line service/trace = 3 to the host profile (/usr/sap/hostctrl/exe/host_profile). The profile file is located in:
      • (UNIX): /usr/sap/hostctrl/exe/host_profile
      • (Windows): %ProgramFiles%\SAP\hostctrl\exe\host_profile
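Adding a trace line by hand can leave duplicate entries if the parameter is already present. The sketch below sets a parameter idempotently, assuming the profile holds one `name = value` entry per line as in the examples above. set_profile_param is a hypothetical helper, and you should back up the profile before editing it.

```shell
# Sketch: set "key = value" in a profile file, replacing any existing entry
# for that key. set_profile_param is a hypothetical helper name.
set_profile_param() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    # Copy every line except ones that already assign this exact key.
    awk -v k="$key" '
        { line = $0; sub(/^[ \t]+/, "", line) }
        index(line, k) == 1 && substr(line, length(k) + 1) ~ /^[ \t]*=/ { next }
        { print }
    ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

For example, `set_profile_param /usr/sap/hostctrl/exe/host_profile service/trace 3` sets the Host Agent trace level. The same helper would work for ha/syb/trace or ha/syb/dbctrl_timeout in SYBHA.PFL. The change takes effect only after the corresponding restart.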

7 Comments

  1. Fernando Pardo

    Hi,

     

    I am having an error installing Fault Manager:

     

    – Root user
    – ldd version 2.11
    – Linux SUSE 12
    – hostagent 721 patch23
    – Installing on a different host than ERP1 and ERP2 (primary and standby SAP servers)
    – ulimit 4096

    ERROR

    2017 01/27 17:37:54.824 (11876) loading executable /usr/sap/SYB/SYS/exe/run/sybdbfm for heartbeat to SAPHostAgent tools.
    2017 01/27 17:37:54.824 (11876) upload executable /usr/sap/SYB/SYS/exe/run/sybdbfm.
    2017 01/27 17:37:54.824 (11876) ERROR: cannot open file /usr/sap/SYB/SYS/exe/run/sybdbfm for read.
    2017 01/27 17:37:54.824 (11876) bootstrap failed.
    2017 01/27 17:37:57.825 (11876) start bootstrap.

     

    I don't understand why it is asking for an SAP directory. Also, the SID for ERP is PRD, not SYB.

    Finally, I couldn't find fault_manager_responses.txt in the $SYBASE/log directory (secondary server).

    Any clue?

    1. Maria-Cristina NORMAND

      Hello,

      This blog refers to HADR for SAP ASE for custom applications, so it does not apply to HADR for SAP ASE for Business Suite.

      Note that in a Business Suite environment, Fault Manager is currently not supported, as stated in SAP Note 1891560 – Disaster Recovery Setup with SAP Replication Server:


      General Limitations for SAP Replication Server 15.7.1:

      SAP Netweaver Business Warehouse (BW) or systems using SAP BW features like SAP SCM APO, SAP SEM, and SAP Solution Manager are currently not supported.

      SAP Replication Server 15.7.1 SP200 and higher is not supported for SAP ASE 15.7.
      SRS SP200 and higher requires SAP ASE 16.0 as a minimum. The versions that are supported for SAP ASE 16.0 are specified below.

      Important: Fault Manager is not supported for HADR for Business Suite environments.


      Regards,

      Cris

       

       

       

      1. Fernando Pardo

        Hi Cris,

         

        I didn’t notice this limitation when I checked the note.

         

        I have these versions:

        ASE: 16.0 SP02 PL05 HF1
        SRS: 15.7.1 SP305 (supported)

         

        In ASE Cockpit I can see both servers, with status green (primary) and grey (standby), and replication works fine. But if Fault Manager is not supported, what tool should I use? What's next?

         

        Regards.

         

         

        1. Maria-Cristina NORMAND

          Hello Fernando,

          There are still issues preventing Fault Manager from being supported for the Business Suite, even though ASE 16 SP02 PL05 HF1 is supported for HADR.

          DBA Cockpit is the recommended tool when running SAP applications on SAP ASE, HADR options have been enhanced there. ASE Cockpit has not been specifically designed for ASE for Business Suite, and usually customers running SAP Applications on ASE are not even aware of its existence 🙂

          The advantage of the Fault Manager is that it monitors the health of the components of an HADR environment (ASE, SRS, RMA) for you and takes action automatically depending on their health. Without it, you can still set up your HADR environment, monitor it, and take the actions needed.

          HTH

          Regards,

          Cris

        2. Rajendra Shriramula

          Hi Fernando,

          Can you please tell me which tool or software you use for automatic failover with ASE HADR for Business Suite? I have also configured ASE HADR for Business Suite and am looking for a mechanism to fail over automatically.

          Thanks

          Abu

  2. lisa c
    • SAP Solution Manager & SAP Netweaver Business Warehouse (BW) or systems using SAP BW features are currently not supported.
    • SAP ASE 15.7 does not support SAP Replication Server 15.7.1 SP200 and higher
      SAP ASE 16.0 is required as a minimum for SRS SP200

    Thanks

    Lisa C | Customer Success Manager

