Technology Blogs by SAP
Many customers operate SAP NetWeaver based products in a Windows Failover Cluster, and many of you have asked for a troubleshooting guide for cluster-related problems.

This blog lists error messages of SAP's Windows Failover Cluster DLL (saprc.dll) and gives hints and solutions for known problems of SAP software in Failover Clusters.

If you see a related event from saprc.dll in the Windows application log, click the URL in the event description. You will be redirected to this document, at the entry for the related error code, where you will find a more detailed problem description and possible solutions.

This guide refers to saprc.dll with file version > 1.0.
If you use an older saprc.dll (with saprcex.dll and sapclus.dll) then this document does not apply.

Common How To Troubleshooting Steps


Question: I have installed a new Windows Failover Cluster, how can I register SAP Resource Types (I am not using SWPM (SAPInst))?

Answer: To register SAP cluster Resource Types run: insaprct.exe -reg


To unregister run: insaprct.exe -unreg


 

Question: Where can I find information, warnings and error messages from saprc.dll?

Answer: Look for saprc entries in Windows Event Viewer (eventvwr.exe):




  • Open Application event log and set a filter on event sources = “SAP cluster”.

  • saprc.dll may trigger event IDs 12345 and 35306.


 

Question: Where can I find the cluster node’s log and how can I read it?

Answer: Open a PowerShell with administrative rights and run:


get-clusterlog -destination <directory> -uselocaltime -timespan <minutes>


The command retrieves the cluster.log files from all cluster nodes and stores them in the folder specified by <directory>.
With the -uselocaltime switch, timestamps are written in local Windows time; without it, all timestamps in the cluster log are in UTC.
The -timespan parameter limits the log to the last <minutes> minutes, so choose a value large enough to include the time at which the error occurred.
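
As a rough illustration of choosing the -timespan value, the following Python sketch computes how many minutes lie between the error and the current time and adds a safety margin. The function name timespan_minutes and the default margin of 10 minutes are assumptions made for this example; they are not SAP or Microsoft recommendations.

```python
import math
from datetime import datetime


def timespan_minutes(error_time: datetime, now: datetime, margin: int = 10) -> int:
    """Return a minute count that reaches back from `now` past `error_time`."""
    elapsed = (now - error_time).total_seconds() / 60
    # Round up to whole minutes, add the margin, and never return less than 1.
    return max(1, math.ceil(elapsed) + margin)


# Example: the error happened at 11:20 and the log is collected at 12:10.
minutes = timespan_minutes(datetime(2017, 4, 10, 11, 20), datetime(2017, 4, 10, 12, 10))
print(f"get-clusterlog -destination c:\\temp -uselocaltime -timespan {minutes}")
```

Both timestamps must use the same time zone, otherwise the computed span is off by the zone difference.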


The cluster.log can be read with any Windows text editor. This example uses Notepad:


Example:



PS C:\temp> get-clusterlog -destination c:\temp -uselocaltime -timespan 60

Mode LastWriteTime Length Name
---- ------------- ------ ----
-a---- 10.04.2017 12:13 2934310 wsiv1006-1_cluster.log
-a---- 10.04.2017 12:12 8695784 wsiv1006-2_cluster.log

PS C:\temp> notepad .\wsiv1006-1_cluster.log

Hint: The latest entries are at the end of cluster.log, so go to the end of the file first and work backward from there.
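
Cluster logs can be several megabytes in size, so it can help to filter them for saprc.dll messages before reading. The following Python sketch is one possible way to do that. It assumes that every relevant line contains a code of the form ntha-<number> and, optionally, a resource name in square brackets, as in the samples in this guide. The names parse_ntha_line and latest_entries are made up for this example.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# The "ntha-<number>" code that saprc.dll puts into its messages.
NTHA_RE = re.compile(r"\bntha-(\d+)\b")
# The bracketed resource name, e.g. "[SAP NCC 00 Instance]".
RESOURCE_RE = re.compile(r"\[(SAP [^\]]+)\]")


class NthaEntry(NamedTuple):
    code: int
    resource: Optional[str]
    message: str


def parse_ntha_line(line: str) -> Optional[NthaEntry]:
    """Return the last ntha code on the line and the text that follows it."""
    matches = list(NTHA_RE.finditer(line))
    if not matches:
        return None
    last = matches[-1]
    resource = RESOURCE_RE.search(line)
    return NthaEntry(
        code=int(last.group(1)),
        resource=resource.group(1) if resource else None,
        message=line[last.end():].strip(),
    )


def latest_entries(lines: Iterable[str], limit: int = 20) -> List[NthaEntry]:
    """Return up to `limit` ntha entries, newest (last in the file) first."""
    entries = [e for e in map(parse_ntha_line, lines) if e is not None]
    return entries[::-1][:limit]
```

Passing the lines of a cluster.log file to latest_entries returns the most recent saprc.dll messages first, which matches the hint above about starting at the end of the log.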


 

Question: Which network ports are being used by SAP software?

Answer: To view the current port usage run: insaprct.exe -ports


You will get an overview of all used ports and the current settings of the dynamic port ranges.


 

Question: SAP support asks for a crash dump. Where can I find it?

Answer: SAP Resource DLL (saprc.dll) may create one or more crash dumps in case of an unexpected failure. Dumps are created in the "temp" directory.


Example:




  • C:\Windows\Temp\_saprc_01.dmp

  • C:\Windows\Temp\_sapstartsrv_04.dmp


When a dump is created, a message containing the path of the dump file and some additional information is also written to the Windows Application event log.


To provide all necessary data for SAP support, collect this:




  • *.dmp dumps above

  • saprc.dll

  • saprc.pdb

  • cluster log from the affected cluster node

  • application event log from the affected cluster node


Zip all files together and send the zipped file to SAP support.
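
The collection step can be scripted. The following Python sketch shows one way to zip the files listed above and to report which of them could not be found. The function name bundle_support_files and the flat archive layout are choices made for this example; SAP support does not prescribe a particular archive structure here.

```python
import zipfile
from pathlib import Path
from typing import Iterable, List


def bundle_support_files(files: Iterable[str], archive: str) -> List[str]:
    """Zip the files that exist and return the paths that were missing."""
    missing = []
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name in files:
            path = Path(name)
            if path.is_file():
                # Store only the file name so the archive has a flat layout.
                zf.write(path, arcname=path.name)
            else:
                missing.append(str(path))
    return missing
```

A call such as bundle_support_files([r"C:\Windows\Temp\_saprc_01.dmp", r"C:\Windows\System32\saprc.dll"], r"C:\temp\sap_support.zip") would create the archive and return an empty list if both files exist. The target path C:\temp\sap_support.zip is only an example.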







Specific ntha error messages


In this section you find the error codes from saprc.dll. Look in Windows event log for events of "SAP cluster" source or generate cluster.log to find the related entries.


ntha-160


Cluster.log shows this:

000000329 Error 15:16:15.560 #5928.6908 [SAP NCC 00 Instance] ntha-160 SAPNCC_00 soap start error: 28, The wait operation timed out., WaitForSingleObject failed in plugin_pipefrecv()

Problem: SAP Resource fails to start.

Solution: Investigate and fix the underlying issue in the SAP system.


ntha-200


Cluster.log shows errors in range 2xx like these examples:

000000227 Info 13:22:30.451 #5536.7648 [SAP NCC 00 Instance] ntha-204 SAPNCC_00 before soap ping

000000228 Error 13:23:00.470 #5536.7648 [SAP NCC 00 Instance] ntha-205 SAPNCC_00 soap error: 28 [The wait operation timed out., WaitForSingleObject failed in plugin_pipefrecv()], count: 1

000000229 Error 13:23:00.471 #5536.7648 [SAP NCC 00 Instance] ntha-199 SAPNCC_00 is not alive, IsAlive will time out in 175 sec

000000230 Info 13:23:00.481 #5536.7648 [SAP NCC 00 Instance] ntha-201 SAPNCC_00 end instance #9 IsAliveInternal: false

000000243 Error 13:27:25.893 #5536.7648 [SAP NCC 00 Instance] ntha-205 SAPNCC_00 soap error: 28 [The semaphore timeout period has expired., CreateFile failed in plugin_pipefopen()], count: 3

000000244 Error 13:27:32.227 #5536.7648 [SAP NCC 00 Instance] ntha-298 Created SOAP hang dump: SAPNCC_00, PID: 6708, dump file: C:\Windows\TEMP\_sapstartsrv_00.dmp

or cluster.log shows error ntha-200:

000000245 Error 13:27:32.228 #5536.7648 [SAP NCC 00 Instance] ntha-200 SAPNCC_00 IsAliveNot after timeout: 308 sec, SOAP failures count: 3, the last fault: The semaphore timeout period has expired., CreateFile failed in plugin_pipefopen()

Problem: The SAP Resource repeatedly fails to call the sapstartsrv web server. The communication between saprc.dll and sapstartsrv.exe over the SOAP protocol does not work.

Solution: sapstartsrv.exe may be hanging or busy in a CCMS thread, or some other unexpected behavior has occurred in the OS or the SOAP interface.


Collect the dump files along with additional info and contact support (see above).



ntha-211


Cluster.log shows this:

000026630 Error 07:34:49.841 #6492.7268 [SAP BUZ 00 Service] ntha-211 srv: SAPBUZ_99, OpenServiceW() failed: 1060

Problem: SAP Service fails to start, because it does not exist!

Solution: Check SAP Service resource properties in Failover Cluster Manager. In this example, the SAP Service is configured with instance number "00", but the SAP Service resource property of the Service Name is configured to start a service with instance number "99". A service with name "SAPBUZ_99" does not exist.



ntha-218


Cluster.log shows this:

Error 17:17:34.284 #1472.5100 [SAP TOD 00 Service] ntha-218 SAPTOD_00 failed to create SAPMNT ping file: [\\hvc6share1\sapmnt\TOD\SYS\global\SAPclusterPing.txt], error: 5. You run w/out SAPMNT monitoring

Problem: saprc.dll writes a ping file to the \\<SAPGLOBALHOST>\sapmnt\<SID>\SYS\global directory. If this fails while the online request is being processed, the share is not monitored.

The error number is an error from Windows. You can get more info by running net helpmsg 5 in a command-prompt. In this case the result is: "Access is denied."

Solution: To allow monitoring, make sure the cluster system account (<clustername>$) has read/write access to the SAPMNT share.


Example: The name of your cluster is "sap-cluster9". Add the account <your-domain>\sap-cluster9$ to the Security settings of the SAPMNT share and give this account full control.

 

 

ntha-225


Cluster.log shows this:

Error ntha-225 SAP PIX 00 Service online failed, service: SAPPIX_00 StartServiceW() error: 2

Problem: saprc.dll requests the start of the sapstartsrv.exe service, but the service executable does not exist.

Solution: The service executable cannot be found at the configured path. Check the service and the path to its executable in Service Control Manager (services.msc). You can also check the ImagePath value of the service in the Registry.
OS error code 2 means: The system cannot find the file specified.


Example: sapstartsrv.exe is missing from the path, or the path is invalid (for example, it contains invalid characters).


ntha-231


Cluster.log shows this:

000000032 Error 14:36:01.279 #5000.5508 [SAP TOD 00 Service] ntha-231 srv: SAPTOD_00, request: 0, cannot start service, service state: 1

Problem: SAP Service fails to start.

Solution: Investigate the sapstartsrv.exe service start problem. Look for related info in the instance \work folder.



ntha-260


Cluster.log shows this:

ERR [RES] SAP Service <SAP XXX 00 Service>: 000003123 Error ntha-260 resource SAPXXX_00 will be dead in 4 sec.

Problem: This is logged after ntha-268 has occurred. There is a problem with the SAPMNT share. If the problem is not fixed, SAPRC.DLL will initiate a failover of the SAP service in 4 seconds.

Solution: See the actions described for ntha-268.

 

ntha-266


Cluster.log shows this:

000000040 ntha-266 000000039 ntha-266 SAP NCC 00 Service needs a dependency either on IP Address or on File Server

Problem: SAP Service needs a dependency either on the File Server resource, or on the IP address resource.

Solution: Add the missing dependency.


In a correct configuration, the SAP Service resource depends on the File Server resource, in this example "SAP BUZ FileServer".




ntha-268


Cluster.log shows this:

SAP Service <SAP XXX 00 Service>: 000003912 Error ntha-268 I/O issues detected, ping file: \\<hostname>\sapmnt\XXX\SYS\global\SAPclusterPing.txt, error: 258, duration: 1 sec

Problem: SAPRC.DLL periodically checks access to the SAPMNT share in several ways: Is the share available? Is the share accessible? What is the response time for writing 26 bytes to the SAPclusterPing.txt file?

If SAPRC.DLL detects problems accessing the SAPMNT share, this error is logged.
In this example, writing 26 bytes took more than 1 second.

Solution:


Check cluster log for additional messages, for example if you use shared disks.
Check Windows application and system log for events regarding disk or filesystem problems.
If SAPMNT is located on a remote computer or storage system, check for problems there.
Check also security solutions (antivirus, host intrusion detection, firewalls) for related problems.

Also take a closer look at the error code. The example above shows "error: 258".

This is the error code SAPRC.DLL gets from Windows. Open a command prompt to find out what this error code means:

C:\>net helpmsg 258

The wait operation timed out.

This indicates that the SAPMNT share exists, is available, and is accessible, but that it hangs.
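
When many log lines have to be checked, looking up each code with net helpmsg becomes tedious. The following Python sketch extracts the number after "error:" or "failed:" from a log line and translates it using a small table of the Windows error codes that appear in this guide. The table is deliberately limited to those codes, and the function name explain_error is made up for this example. Note that the number in SOAP messages (for example "soap error: 28") is not a Windows error code, so such a number could be misread if it happened to match an entry in the table.

```python
import re
from typing import Optional

# Windows system error codes that appear in this guide.
KNOWN_ERRORS = {
    2: "The system cannot find the file specified.",
    5: "Access is denied.",
    67: "The network name cannot be found.",
    258: "The wait operation timed out.",
    1060: "The specified service does not exist as an installed service.",
    1062: "The service has not been started.",
}

# Matches "error: 258" as well as "OpenServiceW() failed: 1060".
ERROR_RE = re.compile(r"(?:error|failed):\s*(\d+)", re.IGNORECASE)


def explain_error(line: str) -> Optional[str]:
    """Return '<code>: <text>' for the first known error code on the line."""
    match = ERROR_RE.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    text = KNOWN_ERRORS.get(code)
    return f"{code}: {text}" if text else None
```

For codes that are not in the table, the function returns nothing, and net helpmsg remains the authoritative source.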

 

ntha-271


Cluster.log shows this:

SAP Service <SAP XXX 00 Service>: 000003125 Error ntha-271 ping file \\SAPGLOBALHOST\sapmnt\XXX\SYS\global\SAPclusterPing.txt

Problem: SAPRC.DLL failed to write its test file to the SAPMNT share on SAPGLOBALHOST.

Solution: Check ACL security on ..\global folder. Check share permissions. Check for related errors in Windows Event Viewer => application log and system log.

 

ntha-277


Cluster.log shows this:

unhandled exception: 0xC0000005. PID: 8460, TID: 1680. Created resource crash dump: C:\Windows\TEMP\_saprc_04.dmp

Problem: A bug in saprc.dll occurred.

Solution: Collect dump along with more info as described above and contact SAP support.



ntha-282


Cluster.log shows this:

000000042 ntha-282 000000041 21:23:45.410 #4692.568 [SAP BUZ 00 Instance] ntha-282 SAP BUZ 00 Instance is missing a dependency on SAP Service

Problem: SAP Resource needs a dependency on a SAP Service.

Solution: Add missing dependency.

Example: SAP Instance “BUZ” with instance number “00” must have a dependency on a service, in this case: “SAP BUZ 00 Service”.



ntha-288


Cluster.log shows this:

000000019 ntha-288 000000018 15:54:20.910 #1968.1492 [SAP NCC 00 Service] ntha-288 bad service name: ServiceName=[BadService]

Problem: The configuration of a SAP Service resource is invalid.

Solution: Correct the property of the SAP Service resource that contains the service name of the related sapstartsrv executable (look up the name in Windows Service Control Manager).


Example:


Failover Cluster Manager shows the service name "SAPBUZ_09" (wrong!), but Service Control Manager displays the correct service name "SAPBUZ_00".




ntha-294


Information: SAPRC.DLL continuously writes information about the used TCP and UDP ports to cluster.log. This information contains the used ports for IPv4 and IPv6 and the configured dynamic port range.

Reason for this information: Log this information for a later root cause analysis.

ntha-300


Cluster.log shows this:

000000189 ntha-300 000000188 16:29:20.944 #8548.4808 [SAP TOD 00 Service] ntha-300 SAPMNT \\hvc6share1\sapmnt has enabled CA flag. This is not supported by SAP, for more information see SAP note 2287140

Problem: The continuous availability (CA) flag is enabled on the SAPMNT share, which SAP does not support. The problems related to the CA feature are explained in SAP note 2287140.


Look for ntha-305 message in cluster log, which shows more details about the SAPMNT share configuration. Here is an example of a correctly configured system:

000000071 17:23:28.434 #1772.8948 [SAP TOD 00 Service] ntha-305 share: \\hvc6share1\sapmnt, protocol: SMB, version: 3.1.1, capabilities: 0x50, cluster: true, continuous availability: false

Solution: Disable the continuous availability flag on the SAPMNT share and on any other shares located on the same shared disk.




ntha-306


Cluster.log shows this:

000000015 ntha-306 000000014 15:16:52.809 #8984.1832 [SAP NCC 00 Service] ntha-306 dll: C:\Windows\Cluster\saprc.dll must be in directory: C:\Windows\system32

Problem: saprc.dll does not exist in \windows\system32.

Solution: Make sure saprc.dll is placed in \windows\system32 folder. Copy it from another cluster node.



ntha-307


Cluster.log shows this:

000002312 ntha-307 000002311 15:04:12.715 #7380.4028 [SAP NCC 00 Service] ntha-307 please upgrade this node to the newer dll version: [3.0.0.135], current dll is old: [3.0.0.134]

Problem: You have different versions of saprc.dll on cluster nodes.

Solution: Make sure all nodes have the latest saprc.dll in the \windows\system32 folder.


To identify the DLL version, run this command in an elevated PowerShell on all cluster nodes:


[System.Diagnostics.FileVersionInfo]::GetVersionInfo("$env:SystemRoot\System32\saprc.dll").FileVersion
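
Once the version strings of all nodes have been collected, they should be compared number by number, not as plain text (as text, "3.0.0.9" would sort after "3.0.0.10"). The following Python sketch shows such a comparison. The node names and the function names parse_version and outdated_nodes are made up for this example.

```python
from typing import Dict, List, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '3.0.0.135' into (3, 0, 0, 135) so it compares numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def outdated_nodes(versions: Dict[str, str]) -> List[str]:
    """Return the nodes whose saprc.dll is older than the newest one found."""
    newest = max(parse_version(v) for v in versions.values())
    return sorted(n for n, v in versions.items() if parse_version(v) < newest)


# Hypothetical result of running the version check on two nodes.
print(outdated_nodes({"node1": "3.0.0.135", "node2": "3.0.0.134"}))
```

An empty result means that all nodes report the same file version.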



ntha-310


Cluster.log shows this:

000000066 ntha-310 000000065 12:55:42.380 #7108.4424 [SAP NCC 00 Instance] ntha-310 SAPNCC_00 IsAliveNot after timeout: 308 sec, instance is not online: HA state: yellow. HA relevant apps: 2 of 2 [ msg_server:-1=gray enserver:8260=green ]

Problem: One of the HA-relevant SAP applications has failed.

Solutions (depending on the root cause):


1) You need to investigate which SAP application crashed, for example msg_server.exe (Message Server), look for the dump file, and send it to SAP for further investigation.


2) You can change the timeout in seconds in instance cluster property AcceptableYellowTime. For example: AcceptableYellowTime=300


This setting should only be changed in landscapes where 300 seconds (5 minutes) are not enough to tolerate a "yellow state" condition in SAP MMC.


3) You can change exclusions in SAP instance cluster property HAnotRelevantApps.


Example:


We add gwrd and sapwebdisp to the "not relevant" application list. The cluster will not check these applications and will not initiate a failover, if one crashes.
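
To see quickly which application caused the yellow state, the list in square brackets at the end of the ntha-310 message can be parsed. The following Python sketch does this for the format shown in the sample above. It assumes that each entry has the form name:number=color and that the number is a process ID (with -1 for a process that is not running). That interpretation is a guess based on the sample line, and the names AppState and failed_apps are made up for this example.

```python
import re
from typing import List, NamedTuple


class AppState(NamedTuple):
    name: str
    number: int  # presumably the process ID; -1 in the sample for the failed app
    color: str


# The bracketed list that follows "HA relevant apps:".
APPS_RE = re.compile(r"HA relevant apps:[^\[]*\[([^\]]*)\]")
# One entry of the list, e.g. "msg_server:-1=gray".
APP_RE = re.compile(r"(\S+):(-?\d+)=(\w+)")


def failed_apps(line: str) -> List[AppState]:
    """Return all listed applications whose state is not green."""
    block = APPS_RE.search(line)
    if block is None:
        return []
    apps = [AppState(n, int(p), c) for n, p, c in APP_RE.findall(block.group(1))]
    return [app for app in apps if app.color != "green"]
```

For the sample message above, the function returns only the msg_server entry, which points to the Message Server as the application to investigate.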




ntha-312


Cluster.log shows this:

000000035 Error 14:36:51.585 #3808.3976 [SAP TOD 00 Service] ntha-312 SAPTOD_00 IsAliveNot after timeout: 45 sec, ResUtilVerifyService error: 1062

Problem: sapstartsrv.exe dies.

Solution: Look in instance \work folder for related trace files from sapstartsrv. Look for errors in Windows Application event log, for example sapstartsrv.exe crash events.


The message also contains a Windows error code, 1062 in this example.


To get more information, run net helpmsg 1062. The result is: "The service has not been started."



ntha-313


Cluster.log shows this:

000000126 ntha-313 000000125 17:37:44.380 #1772.8948 [SAP TOD 00 Service] ntha-313 SAPTOD_00 IsAliveNot after timeout: 45 sec, cannot write to ping file: \\hvc6share1\sapmnt\TOD\SYS\global\SAPclusterPing.txt, error: 67

Problem: Ping to sapmnt share has timed out. There are quite a few APIs that may fail and you'll find more details in the cluster log, but basically the storage system which hosts SAPMNT share is not accessible and SAP system cannot operate any longer.

Solution: Fix issues with your storage system.



ntha-317


Information: SAPRC.DLL continuously writes information about the used TCP and UDP ports to cluster.log. This information contains the used ports for IPv4 and IPv6 and the configured dynamic port range.

Reason for this information: Log this information for a later root cause analysis.

ntha-318


Cluster.log shows network related messages like this:

000000040 Error 12:41:42.781 #8460.4776 [SAP NCC 00 Service] ntha-318

port usage 85 is over 70, netstat #5 Port max usage: 85 percent

tcp4 port max usage, dynamic range: 85 percent (219 of 255) on 10.20.93.49, server range: 0 percent (19 of 65280) on 0.0.0.0

udp4 port max usage, dynamic range: 0 percent (7 of 1000) on 0.0.0.0, server range: 0 percent (7 of 64535) on 127.0.0.1

tcp6 port max usage, dynamic range: 12 percent (33 of 255) on fe80::74f7:e530:2359:34b2, server range: 0 percent (8 of 65280) on fe80::74f7:e530:2359:34b2

udp6 port max usage, dynamic range: 0 percent (0 of 1000) on , server range: 0 percent (6 of 64535) on ::

Problem: The Windows server may be running out of ports. You get a notification each time port usage crosses one of the thresholds of 70%, 80%, and 90%.

Solution: To see the detailed port usage, run insaprct.exe -ports, or use netstat -anob to get an overview of all ports and the applications that use them.


Resolve the problem or increase the dynamic port range.
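
The percentages in the ntha-318 message can be recomputed from the "(used of total)" figures. The following Python sketch parses the dynamic range part of one such line and lists the thresholds that have been reached. It assumes the line format shown in the sample above and rounds the percentage down, which reproduces the values in the sample (219 of 255 gives 85). The function name dynamic_usage is made up for this example, and the list of reached thresholds is an illustration, not the exact logic saprc.dll uses for its notifications.

```python
import re
from typing import Dict, Optional

USAGE_RE = re.compile(
    r"^(tcp4|udp4|tcp6|udp6) port max usage, "
    r"dynamic range: \d+ percent \((\d+) of (\d+)\)"
)
THRESHOLDS = (70, 80, 90)  # notification thresholds named in the text


def dynamic_usage(line: str) -> Optional[Dict[str, object]]:
    """Parse one 'port max usage' line and recompute the dynamic range usage."""
    match = USAGE_RE.match(line.strip())
    if match is None:
        return None
    used, total = int(match.group(2)), int(match.group(3))
    percent = 100 * used // total if total else 0
    return {
        "protocol": match.group(1),
        "used": used,
        "total": total,
        "percent": percent,
        "reached": [t for t in THRESHOLDS if percent >= t],
    }
```

In the sample, only the tcp4 dynamic range is critical: 219 of 255 ports are in use, so a dynamic range of only 255 ports is the first thing to review.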



ntha-322


Cluster.log shows this:

000000071 Error 14:13:25.832 #5944.8496 [SAP NCC 00 Instance] ntha-322 dependency SAP Service is in maintenance

Problem: saprc.dll detects an inconsistency in the maintenance configuration of SAP Resource and SAP Service.

Solution: Make sure maintenance mode is the same for both resources.



ntha-349


Cluster.log shows this:

2022-04-06, 11:19:47 Error 2917275 000000202 Error ntha-349 SAP XYZ 00 Service online failed, service: SAPXYZ_00 has timed out after: 49 sec

Problem: saprc.dll starts the related sapstartsrv service. In this case, the service didn't report "running" after 49 seconds.

Solution: Start the service manually using Windows Service Control Manager (services.msc). Find the root cause of the long service startup time by inspecting sapstartsrv.log and sapstart.log.
Name resolution problems are the root cause in 98% of such cases.

Workaround: Change the value of the parameter "AcceptableOfflineTime" in the SAP Service cluster resource. Default is 30 or 45 seconds (depends on the installation time and which SWPM was used). Increase the value to 60 seconds, or more, depending on the root cause for the long service startup time.

 

ntha-385


000000128 Error ntha-385 SAP ECP 00 Instance has a wrong dependency: Generic Service

Problem/Solution: Take this as a warning that you are still using the old Generic Service resource for the SAP service of the (A)SCS instance. Everything works fine.

However, SAP strongly recommends switching to the SAP Service resource. Many enhancements have been made to this resource type, and it is a requirement for the Rolling Kernel Switch (RKS) functionality.

You can change this configuration by running the PowerShell script Switch-SAPServiceResourceType.ps1, which is included in the ntclust.sar package.

This script removes the Generic Service cluster resource and adds the SAP Service resource instead.

 

ntha-390


Error ntha-390 cluster resource: SAP BF3 00 Instance, unexpected cluster terminate,  isAlive: true, state: Online

Problem/Solution: This error will be written in rare situations, for example if the cluster process (clussvc.exe) itself crashes or unexpectedly terminates.

Check for other error messages in cluster log which will explain the situation, unexpected cluster terminate, cluster resource DLL crash, etc. The root cause is a problem with the Windows cluster node.

 

ntha-410


Error ntha-410 SAP P40 20 Instance online failed, cannot start SAP system: SAPP40_20, has given up after 69 sec

Problem/Solution: This error is a follow up message. After 69 seconds the instance could not be brought up online.

Check for prior error messages in cluster log which will explain the root cause.

 

ntha-435


Error ntha-435 cluster resource SAP ABC 00 Service has detected I/O issues with sapmnt share ping file: \\<sapglobalhost>\sapmnt\ABC\SYS\global\SAPclusterPing.txt.
error: 258
ping duration: 5 seconds

Problem: Take this as a serious error! SAPRC.DLL writes continuously to a small text file in \global folder of sapmnt share. If it detects problems with this share, for example long waiting times until a write operation has been completed, then it will produce this error message.
Because of this error, SAP operations may be affected as well!

Solution:

The error message points to a high I/O load or a general problem with the disk on which sapmnt is located.

Check Windows Event Viewer -> system log for more information.

Check your disk subsystem.

The ping duration in the example above means that it took 5 seconds until saprc.dll could read the very small SAPclusterPing.txt file. That is too long. The OS error 258 points to a timed-out operation on the sapmnt share.

 

ntha-450


Information: This number indicates that the health check ("IsAlive" check in Microsoft terminology) for the SAP Service resource was successful and completed in XX seconds.

ntha-451


Information: This number indicates that the health check ("IsAlive" check in Microsoft terminology) for the SAP Instance resource was successful and completed in XX seconds.

 

ntha-461


Error ntha-461 SAP BF3 ERS and SAP BF3 should not run on the same node wsiv1406-1

Problem: You have configured 2 cluster groups, one for an ASCS instance and one for the ERS instance. If there is only one cluster node left for operations, the cluster will bring both cluster groups online on this node. This is an expected situation.

Solution:

If at least one additional cluster node is available, the cluster should move the ERS cluster group away from the node where the ASCS cluster group is running. This error is more of a warning. You should intervene and move the cluster group manually if the cluster does not do so itself.

 

ntha-464


Info  ntha-464 saved away c:\usr\sap\bf3\ascs00\work\dev_ms to: c:\usr\sap\bf3\ascs00\work\dev_ms_terminated.txt

Information: In case of a problem with at least one monitored process (msg_server.exe, enserver.exe, etc.), saprc.dll creates copies of the trace files of these processes for later inspection.

In this example, dev_ms (the trace file of the Message Server) is copied.

ntha-465


Info  ntha-465 "cannot copy file %%%%%"

Information: In case of a problem with at least one monitored process (msg_server.exe, enserver.exe, etc.), saprc.dll creates copies of the trace files of these processes for later inspection.

ntha-465 means that copying a trace file did not succeed. A possible root cause is that the shared disk drive was already offline, so the I/O operation failed.