Many customers operate SAP NetWeaver based products in a Windows Failover Cluster, and many of you have asked for a troubleshooting guide for cluster-related problems.

This blog lists error messages of SAP’s Windows Failover Cluster DLL (saprc.dll) and gives hints and solutions for known problems of SAP software in Failover Clusters.

If you see a related event from saprc.dll in the Windows application log, click the URL in the event description. You will be redirected to this document at the related error code, where you will find a more detailed problem description and solutions.

This guide refers to saprc.dll with file version > 1.0.
If you use an older saprc.dll (with saprcex.dll and sapclus.dll), this document does not apply.

Common How To Troubleshooting Steps

Question: I have installed a new Windows Failover Cluster, how can I register SAP Resource Types (I am not using SWPM (SAPInst))?

Answer: To register SAP cluster Resource Types run: insaprct.exe -reg

To unregister run: insaprct.exe -unreg

 

Question: Where can I find information, warnings and error messages from saprc.dll?

Answer: Look for saprc entries in Windows Event Viewer (eventvwr.exe):

  • Open Application event log and set a filter on event sources = “SAP cluster”.
  • saprc.dll may log events with event ID 12345 or 35306.

 

Question: Where can I find the cluster node’s log and how can I read it?

Answer: Open a PowerShell with administrative rights and run:

get-clusterlog -destination <directory> -uselocaltime -timespan <minutes>

The command retrieves the cluster.log from every cluster node and stores the files in the folder specified by <directory>.
The -uselocaltime switch writes timestamps in the local Windows time zone; otherwise all timestamps in the cluster log are in UTC.
The -timespan parameter limits the log to the last <minutes> minutes; choose a value that covers the time at which the error occurred.
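
Choosing a value for -timespan is simple arithmetic. The following is a minimal Python sketch, assuming that -timespan counts minutes back from the moment the command runs. The function name and the safety margin of 10 minutes are illustrative choices, not part of any SAP or Microsoft tool.

```python
import math
from datetime import datetime


def timespan_minutes(error_time: datetime, now: datetime, margin_minutes: int = 10) -> int:
    """Return a -timespan value that reaches back to before the error.

    margin_minutes is an arbitrary safety margin (assumption) so that the
    log also contains the entries leading up to the error.
    """
    if error_time > now:
        raise ValueError("error_time lies in the future")
    elapsed = (now - error_time).total_seconds() / 60
    # Round up so that a partially elapsed minute is still covered.
    return math.ceil(elapsed) + margin_minutes
```

For an error seen 42.5 minutes ago, this suggests -timespan 53.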

The cluster.log can be read with any Windows text editor. In this example we use Notepad:

Example:

PS C:\temp> get-clusterlog -destination c:\temp -uselocaltime -timespan 60

Mode                LastWriteTime         Length Name
----                -------------         ------ ----
-a----       10.04.2017     12:13        2934310 wsiv1006-1_cluster.log
-a----       10.04.2017     12:12        8695784 wsiv1006-2_cluster.log

PS C:\temp> notepad .\wsiv1006-1_cluster.log

Hint: When reading a cluster.log, go to the end of the log first and work backward to find the latest entries.
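
Because cluster logs are large, it can help to filter for saprc.dll messages first. The sketch below is one possible way to do that in Python. The line pattern is inferred from the sample lines in this guide, not taken from an official format description, and the names NthaEntry, iter_ntha_entries and latest_entries are hypothetical.

```python
import re
from typing import Iterable, Iterator, List, NamedTuple

# Inferred from the samples in this guide (assumption):
# "<time> #<pid>.<tid> [<resource>] ntha-<code> <text>"
NTHA_LINE = re.compile(
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"#(?P<pid>\d+)\.(?P<tid>\d+)\s+"
    r"\[(?P<resource>[^\]]+)\]\s+"
    r"ntha-(?P<code>\d+)\s*(?P<text>.*)"
)


class NthaEntry(NamedTuple):
    time: str
    resource: str
    code: int
    text: str


def iter_ntha_entries(lines: Iterable[str]) -> Iterator[NthaEntry]:
    """Yield one entry per line that contains an ntha message."""
    for line in lines:
        match = NTHA_LINE.search(line)
        if match:
            yield NthaEntry(
                match["time"],
                match["resource"],
                int(match["code"]),
                match["text"].strip(),
            )


def latest_entries(lines: Iterable[str], count: int = 20) -> List[NthaEntry]:
    """Return up to `count` ntha entries, newest (last in the log) first."""
    entries = list(iter_ntha_entries(lines))
    return entries[::-1][:count]
```

Any iterable of text lines works as input, for example an open file object. Depending on how the log file was written, you may have to specify its encoding when opening it.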

 

Question: Which network ports are being used by SAP software?

Answer: To view the current port usage run: insaprct.exe -ports

You will get an overview of all used ports and the current settings of the dynamic port ranges.

 

Question: SAP support asks for a crash dump. Where can I find it?

Answer: SAP Resource DLL (saprc.dll) may create one or more crash dumps in case of an unexpected failure. Dumps are created in the “temp” directory.

Example:

  • C:\Windows\Temp\_saprc_01.dmp
  • C:\Windows\Temp\_sapstartsrv_04.dmp

When a dump is created, a message containing the path of the dump file and some additional information is also written to the Windows Application event log.

To provide all necessary data for SAP support, collect the following:

  • the *.dmp dump files mentioned above
  • saprc.dll
  • saprc.pdb
  • cluster log from the affected cluster node
  • application event log from the affected cluster node

Zip all files together and send the zipped file to SAP support.
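
The collection step can be scripted. The following Python sketch zips a list of files and reports which of them could not be found. The function name bundle_for_support is hypothetical, and the list of paths is something you supply yourself (dump files, saprc.dll, saprc.pdb, the generated cluster log, an exported application event log).

```python
import zipfile
from pathlib import Path
from typing import Iterable, List


def bundle_for_support(files: Iterable[str], archive_path: str) -> List[str]:
    """Zip every existing file in `files`; return the paths that were missing."""
    missing = []
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for item in files:
            path = Path(item)
            if path.is_file():
                # Store only the file name; files with identical names from
                # different folders would collide, so rename them beforehand.
                archive.write(path, arcname=path.name)
            else:
                missing.append(str(path))
    return missing
```

Returning the missing paths instead of failing makes it easy to see at a glance whether, for example, saprc.pdb still has to be located before the archive is sent.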


Specific ntha error messages

This section lists the error codes from saprc.dll. Look in the Windows event log for events with the source “SAP cluster”, or generate the cluster.log to find the related entries.

ntha-160

Cluster.log shows this:

000000329 Error 15:16:15.560 #5928.6908 [SAP NCC 00 Instance] ntha-160 SAPNCC_00 soap start error: 28, The wait operation timed out., WaitForSingleObject failed in plugin_pipefrecv()

Problem: SAP Resource fails to start.

Solution: Investigate and fix the underlying issue in the SAP system.

ntha-200

Cluster.log shows errors in range 2xx like these examples:

000000227 Info 13:22:30.451 #5536.7648 [SAP NCC 00 Instance] ntha-204 SAPNCC_00 before soap ping

000000228 Error 13:23:00.470 #5536.7648 [SAP NCC 00 Instance] ntha-205 SAPNCC_00 soap error: 28 [The wait operation timed out., WaitForSingleObject failed in plugin_pipefrecv()], count: 1

000000229 Error 13:23:00.471 #5536.7648 [SAP NCC 00 Instance] ntha-199 SAPNCC_00 is not alive, IsAlive will time out in 175 sec

000000230 Info 13:23:00.481 #5536.7648 [SAP NCC 00 Instance] ntha-201 SAPNCC_00 end instance #9 IsAliveInternal: false

000000243 Error 13:27:25.893 #5536.7648 [SAP NCC 00 Instance] ntha-205 SAPNCC_00 soap error: 28 [The semaphore timeout period has expired., CreateFile failed in plugin_pipefopen()], count: 3

000000244 Error 13:27:32.227 #5536.7648 [SAP NCC 00 Instance] ntha-298 Created SOAP hang dump: SAPNCC_00, PID: 6708, dump file: C:\Windows\TEMP\_sapstartsrv_00.dmp

or cluster.log shows error ntha-200:

000000245 Error 13:27:32.228 #5536.7648 [SAP NCC 00 Instance] ntha-200 SAPNCC_00 IsAliveNot after timeout: 308 sec, SOAP failures count: 3, the last fault: The semaphore timeout period has expired., CreateFile failed in plugin_pipefopen()

Problem: The SAP Resource permanently fails to call the sapstartsrv web server. The SOAP communication between saprc.dll and sapstartsrv.exe does not work.

Solution: sapstartsrv.exe may hang or be busy in the CCMS thread, or some other unexpected behavior occurred in the OS or the SOAP interface.

Collect the dump files along with the additional information and contact SAP support (see above).

ntha-211

Cluster.log shows this:

000026630 Error 07:34:49.841 #6492.7268 [SAP BUZ 00 Service] ntha-211 srv: SAPBUZ_99, OpenServiceW() failed: 1060

Problem: SAP Service fails to start, because it does not exist!

Solution: Check SAP Service resource properties in Failover Cluster Manager. In this example, the SAP Service is configured with instance number “00”, but the SAP Service resource property of the Service Name is configured to start a service with instance number “99”. A service with name “SAPBUZ_99” does not exist.
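
The mismatch in this example can be spotted mechanically. The Python sketch below derives the expected Windows service name from the cluster resource name and compares it with the configured value. It assumes the naming pattern seen in the examples of this guide (resource “SAP <SID> <NR> Service”, service “SAP<SID>_<NR>”); resources that were renamed will not match, and both function names are hypothetical.

```python
import re

# Assumed resource naming, taken from the examples in this guide:
# "SAP <SID> <NR> Service" or "SAP <SID> <NR> Instance".
_RESOURCE_NAME = re.compile(
    r"SAP (?P<sid>[A-Z][A-Z0-9]{2}) (?P<nr>\d{2}) (?:Service|Instance)"
)


def expected_service_name(resource_name: str) -> str:
    """Derive 'SAP<SID>_<NR>' from a name such as 'SAP BUZ 00 Service'."""
    match = _RESOURCE_NAME.fullmatch(resource_name)
    if match is None:
        raise ValueError(f"unexpected resource name: {resource_name!r}")
    return f"SAP{match['sid']}_{match['nr']}"


def service_name_matches(resource_name: str, configured_service_name: str) -> bool:
    """True if the configured service name fits the resource name."""
    return expected_service_name(resource_name) == configured_service_name
```

With the values from the log line above, service_name_matches("SAP BUZ 00 Service", "SAPBUZ_99") returns False, which points to the wrong instance number in the resource property.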

ntha-218

Cluster.log shows this:

000000024 Error 17:17:34.284 #1472.5100 [SAP TOD 00 Service] ntha-218 SAPTOD_00 failed to create SAPMNT ping file: [\\hvc6share1\sapmnt\TOD\SYS\global\SAPclusterPing.txt], error: 5. You run w/out SAPMNT monitoring

Problem: saprc.dll writes a ping file to the \\<SAPGLOBALHOST>\sapmnt\<SID>\SYS\global directory. If this fails while the online request is being processed, your share is not monitored.

The error number is a Windows error code. You can get more information by running net helpmsg 5 at a command prompt. In this case the result is: “Access is denied.”

Solution: To allow monitoring make sure the cluster system account (<clustername>$) has read/write access to the SAPMNT share.
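
Several messages in this guide end with a Windows error number. As a convenience, the Python sketch below extracts that number from a log line and looks it up in a small table. The table only covers the codes that appear in the samples of this guide, and the function name windows_error_hint is hypothetical. For any other code, net helpmsg remains the authoritative source.

```python
import re
from typing import Optional

# Windows error numbers that occur in the sample log lines of this guide.
WINDOWS_ERRORS = {
    5: "Access is denied.",
    67: "The network name cannot be found.",
    1060: "The specified service does not exist as an installed service.",
    1062: "The service has not been started.",
}

# Matches "... error: 5" as well as "... failed: 1060".
_CODE = re.compile(r"(?:error|failed): (\d+)")


def windows_error_hint(log_line: str) -> Optional[str]:
    """Return '<code>: <message>' for a known code, otherwise None."""
    match = _CODE.search(log_line)
    if match is None:
        return None
    code = int(match.group(1))
    message = WINDOWS_ERRORS.get(code)
    return f"{code}: {message}" if message else None
```

Codes outside the table, such as the 28 in the “soap error: 28” lines, deliberately return None rather than a guess.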

ntha-231

Cluster.log shows this:

000000032 Error 14:36:01.279 #5000.5508 [SAP TOD 00 Service] ntha-231 srv: SAPTOD_00, request: 0, cannot start service, service state: 1

Problem: SAP Service fails to start.

Solution: Investigate the sapstartsrv.exe service start problem. Look for related info in the instance \work folder.

ntha-266

Cluster.log shows this:

000000040 ntha-266 000000039 ntha-266 SAP NCC 00 Service needs a dependency either on IP Address or on File Server

Problem: SAP Service needs a dependency either on the File Server resource, or on the IP address resource.

Solution: Add the missing dependency.

Example of a correct configuration: the SAP Service resource has a dependency on the File Server resource “SAP BUZ FileServer”.

ntha-277

Cluster.log shows this:

unhandled exception: 0xC0000005. PID: 8460, TID: 1680. Created resource crash dump: C:\Windows\TEMP\_saprc_04.dmp

Problem: A bug in saprc.dll occurred.

Solution: Collect dump along with more info as described above and contact SAP support.

ntha-282

Cluster.log shows this:

000000042 ntha-282 000000041 21:23:45.410 #4692.568 [SAP BUZ 00 Instance] ntha-282 SAP BUZ 00 Instance is missing a dependency on SAP Service

Problem: The SAP Resource needs a dependency on an SAP Service.

Solution: Add the missing dependency.

Example: SAP Instance “BUZ” with instance number “00” must have a dependency on a service, in this case: “SAP BUZ 00 Service”.


ntha-288

Cluster.log shows this:

000000019 ntha-288 000000018 15:54:20.910 #1968.1492 [SAP NCC 00 Service] ntha-288 bad service name: ServiceName=[BadService]

Problem: The configuration of an SAP Service resource is invalid.

Solution: Correct the property of the SAP Service resource that contains the service name of the related sapstartsrv executable (look up the name in the Windows Service Control Manager).

Example:

Failover Cluster Manager shows the service name “SAPBUZ_09” (wrong!), but Service Control Manager displays the correct service name “SAPBUZ_00”.

ntha-300

Cluster.log shows this:

000000189 ntha-300 000000188 16:29:20.944 #8548.4808 [SAP TOD 00 Service] ntha-300 SAPMNT \\hvc6share1\sapmnt has enabled CA flag. This is not supported by SAP, for more information see SAP note 2287140

Problem: The SAPMNT share has the continuous availability (CA) flag enabled, which SAP does not support. The problems related to the CA feature are explained in SAP note 2287140.


Look for ntha-305 message in cluster log, which shows more details about the SAPMNT share configuration. Here is an example of a correctly configured system:

000000071 17:23:28.434 #1772.8948 [SAP TOD 00 Service] ntha-305 share: \\hvc6share1\sapmnt, protocol: SMB, version: 3.1.1, capabilities: 0x50, cluster: true, continuous availability: false

Solution: Disable the continuous availability flag on the SAPMNT share and on any other shares that may be located on the same shared disk.

ntha-306

Cluster.log shows this:

000000015 ntha-306 000000014 15:16:52.809 #8984.1832 [SAP NCC 00 Service] ntha-306 dll: C:\Windows\Cluster\saprc.dll must be in directory: C:\Windows\system32

Problem: saprc.dll does not exist in \windows\system32.

Solution: Make sure saprc.dll is placed in \windows\system32 folder. Copy it from another cluster node.

ntha-307

Cluster.log shows this:

000002312 ntha-307 000002311 15:04:12.715 #7380.4028 [SAP NCC 00 Service] ntha-307 please upgrade this node to the newer dll version: [3.0.0.135], current dll is old: [3.0.0.134]

Problem: You have different versions of saprc.dll on cluster nodes.

Solution: Make sure all nodes have the latest saprc.dll in the \windows\system32 folder.

To identify the DLL version, run this command in an elevated PowerShell on all cluster nodes:

[System.Diagnostics.FileVersionInfo]::GetVersionInfo("$env:windir\System32\saprc.dll").FileVersion
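
Once you have noted the version from each node, the comparison must be numeric per component, not alphabetical (otherwise 3.0.0.9 would sort after 3.0.0.10). A minimal Python sketch, with hypothetical function names and node names borrowed from the earlier example:

```python
from typing import Dict, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '3.0.0.135' into (3, 0, 0, 135) for numeric comparison."""
    return tuple(int(part) for part in text.strip().split("."))


def outdated_nodes(versions: Dict[str, str]) -> Dict[str, str]:
    """Return the nodes whose saprc.dll is older than the newest one found."""
    newest = max(parse_version(v) for v in versions.values())
    return {
        node: version
        for node, version in versions.items()
        if parse_version(version) < newest
    }
```

For example, outdated_nodes({"wsiv1006-1": "3.0.0.135", "wsiv1006-2": "3.0.0.134"}) returns {"wsiv1006-2": "3.0.0.134"}, the node that needs the newer DLL.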

ntha-310

Cluster.log shows this:

000000066 ntha-310 000000065 12:55:42.380 #7108.4424 [SAP NCC 00 Instance] ntha-310 SAPNCC_00 IsAliveNot after timeout: 308 sec, instance is not online: HA state: yellow. HA relevant apps: 2 of 2 [ msg_server:-1=gray enserver:8260=green ]

Problem: One of the HA-relevant SAP applications has failed.

Solutions (depending on the root cause):

1) You need to investigate which SAP application crashed, for example msg_server.exe (Message Server), look for the dump file, and send it to SAP for further investigation.

2) You can change the timeout in seconds in instance cluster property AcceptableYellowTime. For example: AcceptableYellowTime=300

This setting should only be changed in landscapes where 300 seconds (5 minutes) are not enough to tolerate a “yellow state” condition in SAP MMC.

3) You can change exclusions in SAP instance cluster property HAnotRelevantApps.

Example:

We add gwrd and sapwebdisp to the “not relevant” application list. The cluster will not check these applications and will not initiate a failover, if one crashes.
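
To see quickly which application caused the yellow state, the application list in the ntha-310 message can be parsed. The Python sketch below does this for the format shown in the sample line above. The interpretation of the number after the colon as a process ID (with -1 meaning “not running”) is an assumption, and the function names are hypothetical.

```python
import re
from typing import List, Tuple


def parse_ha_apps(log_line: str) -> List[Tuple[str, int, str]]:
    """Return (name, number, state) for each app listed in an ntha-310 line.

    The number is presumably the process ID (assumption).
    """
    match = re.search(r"HA relevant apps:[^\[]*\[([^\]]*)\]", log_line)
    if match is None:
        return []
    return [
        (name, int(number), state)
        for name, number, state in re.findall(r"(\S+):(-?\d+)=(\w+)", match.group(1))
    ]


def failed_apps(log_line: str) -> List[str]:
    """Names of all listed apps whose state is not green."""
    return [name for name, _number, state in parse_ha_apps(log_line) if state != "green"]
```

Applied to the sample line, failed_apps returns ["msg_server"], so the Message Server is the application to investigate first.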

ntha-312

Cluster.log shows this:

000000035 Error 14:36:51.585 #3808.3976 [SAP TOD 00 Service] ntha-312 SAPTOD_00 IsAliveNot after timeout: 45 sec, ResUtilVerifyService error: 1062

Problem: sapstartsrv.exe has terminated unexpectedly.

Solution: Look in the instance \work folder for related trace files from sapstartsrv. Look for errors in the Windows Application event log, for example sapstartsrv.exe crash events.

The message also contains a Windows error number, 1062 in this example.

To get more information, run net helpmsg 1062. In this case the result is: “The service has not been started.”

ntha-313

Cluster.log shows this:

000000126 ntha-313 000000125 17:37:44.380 #1772.8948 [SAP TOD 00 Service] ntha-313 SAPTOD_00 IsAliveNot after timeout: 45 sec, cannot write to ping file: \\hvc6share1\sapmnt\TOD\SYS\global\SAPclusterPing.txt, error: 67

Problem: The ping to the SAPMNT share has timed out. Several APIs may fail, and you will find more details in the cluster log, but in essence the storage system that hosts the SAPMNT share is not accessible and the SAP system cannot operate any longer.

Solution: Fix issues with your storage system.

ntha-318

Cluster.log shows network related messages like this:

000000040 Error 12:41:42.781 #8460.4776 [SAP NCC 00 Service] ntha-318

port usage 85 is over 70, netstat #5 Port max usage: 85 percent

tcp4 port max usage, dynamic range: 85 percent (219 of 255) on 10.20.93.49, server range: 0 percent (19 of 65280) on 0.0.0.0

udp4 port max usage, dynamic range: 0 percent (7 of 1000) on 0.0.0.0, server range: 0 percent (7 of 64535) on 127.0.0.1

tcp6 port max usage, dynamic range: 12 percent (33 of 255) on fe80::74f7:e530:2359:34b2, server range: 0 percent (8 of 65280) on fe80::74f7:e530:2359:34b2

udp6 port max usage, dynamic range: 0 percent (0 of 1000) on , server range: 0 percent (6 of 64535) on ::

Problem: The Windows server may be running out of ports. You will be notified when port usage crosses the thresholds of 70%, 80% and 90%.

Solution: To see detailed port usage, run insaprct.exe -ports, or use netstat -anob to get an overview of all ports and the applications using them.

Resolve the problem or increase the dynamic port range.
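
The percentages in the ntha-318 message follow directly from the “used of total” figures. The Python sketch below recomputes the dynamic range usage from one of the sample lines and lists the thresholds it exceeds. The line format and the rounding down to whole percent are inferred from the samples above, and the function names are hypothetical.

```python
import re
from typing import List, Optional, Tuple

THRESHOLDS = (70, 80, 90)  # notification thresholds named in this guide

# Inferred from the samples: "<proto> port max usage, dynamic range:
# <n> percent (<used> of <total>) on <address>, ..."
_DYNAMIC = re.compile(
    r"^(?P<proto>\w+) port max usage, dynamic range: "
    r"\d+ percent \((?P<used>\d+) of (?P<total>\d+)\)"
)


def dynamic_port_usage(line: str) -> Optional[Tuple[str, int]]:
    """Return (protocol, percent used) for the dynamic port range."""
    match = _DYNAMIC.search(line.strip())
    if match is None:
        return None
    used, total = int(match["used"]), int(match["total"])
    if total == 0:
        return None
    # Integer division reproduces the whole-percent values in the samples.
    return match["proto"], used * 100 // total


def thresholds_exceeded(percent: int) -> List[int]:
    """All notification thresholds that lie below the given usage."""
    return [t for t in THRESHOLDS if percent > t]
```

For the tcp4 sample line, 219 of 255 ports gives 85 percent, which exceeds the 70 and 80 percent thresholds.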

ntha-322

Cluster.log shows this:

000000071 Error 14:13:25.832 #5944.8496 [SAP NCC 00 Instance] ntha-322 dependency SAP Service is in maintenance

Problem: saprc.dll detects an inconsistency in the maintenance configuration of SAP Resource and SAP Service.

Solution: Make sure maintenance mode is the same for both resources.

 

ntha-385

000000128 Error ntha-385 SAP ECP 00 Instance has a wrong dependency: Generic Service

Problem/Solution: Take this as a warning that you are still using the old Generic Service resource for the SAP service of the (A)SCS instance. Everything still works.

However, SAP strongly recommends switching to the SAP Service resource. Many enhancements have been made to this resource type, and it is a requirement for the Rolling Kernel Switch (RKS) functionality.

You can change this configuration by running the PowerShell script Switch-SAPServiceResourceType.ps1, which is included in the ntclust.sar package.

This script removes the Generic Service cluster resource and adds the SAP Service resource in its place.

2 Comments

  1. Karl-Heinz Hochmuth Post author

     

    Hi Yatin,

    you’re right, sorry for that! Please use the attached PowerShell script (part of the ntclust.sar package) to change your Generic Service resource to a “SAP Service” resource. Then this error message will disappear. This “Error” is more of a “Warning” to get rid of the old Generic Service.

    You have to change this service if you want to use the RKS functionality (Rolling Kernel Switch).

    But we recommend changing this anyway, even if you don’t want to use RKS. The SAP Service resource is more stable and is controlled by saprc.dll directly.

    I will add this error 385 to the blog document very soon!

     

    Best regards,

    Kalle
