In the past, implementing a highly available SAP NetWeaver system was never straightforward because it required shared cluster storage. In on-premises installations we were forced to use expensive storage arrays, and in the cloud things were no easier: the only solution was to use third-party software that replicated disks in the background.

Fortunately, those dark times are now a thing of the past thanks to a few changes implemented by SAP and Microsoft. We can now use a file share instead of a shared cluster disk for the Central Services instance, and Microsoft released Storage Spaces Direct (S2D), a feature that implements software-defined storage in Windows Server 2016. Those two innovations have a major impact on designing fault-tolerant solutions. You can deploy the entire solution in Microsoft Azure and eliminate the need to buy SIOS DataKeeper or StarWind licenses to simulate a SAN drive.

With Storage Spaces Direct you can use internal server drives to build a highly scalable storage solution. It requires at least two servers with two data disks each. Fault tolerance is implemented in a way similar to RAID; in S2D, however, the data is distributed across servers.

There are two deployment methods available:

  • Disaggregated – storage and compute in separate clusters
  • Hyper-converged – one cluster for storage and compute

Due to SMB loopback limitations we can’t use the hyper-converged deployment model. We will create two separate failover clusters – on the first one we will enable the Storage Spaces Direct and Scale-Out File Server functionalities and on the second one, we will install SAP Central Services Instance.

Currently, the installation of a highly available Central Services instance is not yet supported in Software Provisioning Manager. We will therefore perform a local installation of ASCS and ERS and then manually build the failover cluster by registering SAP resources.

RESOURCE PROVISIONING

To deploy Storage Spaces Direct and Scale-Out File Server I provisioned the following Azure resources:

Resource type      Resource name   Comments
Resource Group     S2D-FC1         Resource group for the S2D cluster
Virtual Machine    s2d-fc1-0       Add two data disks
Virtual Machine    s2d-fc1-1       Add two data disks
Storage Account    s2dfc1cw        Cloud Witness (General Purpose, LRS, Standard Performance)
Availability Set   FC1
Resource Group     S2D-FC2         Resource group for the ASCS instance servers (deployed through template)
Virtual Machine    s2d-xscs-0
Virtual Machine    s2d-xscs-1
Availability Set   FC2
Load Balancer      s2d-lb-xscs
Storage Account    s2dfc2cw        Cloud Witness (General Purpose, LRS, Standard Performance)

You can see that our landscape is a bit more complex than usual, as we are building four VMs. A domain controller is required to build a failover cluster. Mine is already deployed, and I won’t discuss the Active Directory setup in this post.

You can provision the required resources manually or use the quickstart templates.

ASCS template: sap-3-tier-marketplace-image-multi-sid-xscsmd

S2D template: 301-storage-spaces-direct

SCALE-OUT FILE SERVER CONFIGURATION

When our VMs are ready we can start the configuration. We already saw how to prepare the failover cluster using the GUI wizards, so I want to show you how you can save some time by using PowerShell for basic tasks.

Add computer to domain:

$domain = "myDomain"
$user = "username"
$password = "myPassword!" | ConvertTo-SecureString -asPlainText -Force
$username = "$domain\$user" 
$credential = New-Object System.Management.Automation.PSCredential($username,$password)
Add-Computer -DomainName $domain -Credential $credential

Install Failover Clustering role:

Install-WindowsFeature -Name Failover-Clustering -IncludeManagementTools
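
If you prefer to prepare both storage nodes from one session, the same installation can be pushed remotely. This is only a sketch under a few assumptions: the node names are the ones from the resource table, both machines are already domain members, and remote management is enabled. Adding the FS-FileServer feature is my own addition here, because the Scale-Out File Server role that we configure later needs the File Server role service.

```powershell
# Node names taken from the resource table; adjust to your landscape.
$nodes = "s2d-fc1-0", "s2d-fc1-1"

foreach ($node in $nodes) {
    # Failover-Clustering is the feature used in this post;
    # FS-FileServer is added for the Scale-Out File Server role.
    Install-WindowsFeature -ComputerName $node `
        -Name Failover-Clustering, FS-FileServer `
        -IncludeManagementTools
}

# Confirm that both features report "Installed" on every node.
foreach ($node in $nodes) {
    Get-WindowsFeature -ComputerName $node -Name Failover-Clustering, FS-FileServer |
        Format-Table Name, InstallState -AutoSize
}
```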

Joining the computer to the domain requires a restart, so we have time to make a cup of coffee. The next step is to create a Windows Failover Cluster:

New-Cluster -Name <cluster_name> -Node <node1, node2> -StaticAddress <Load_balancer_ip> -NoStorage
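
Before actually running New-Cluster, it can be worth validating the nodes first. The sketch below uses the standard Test-Cluster cmdlet with the validation categories relevant for Storage Spaces Direct; the node names are again the ones from the resource table and are an assumption about your landscape.

```powershell
# Validate the future cluster nodes, including the Storage Spaces Direct checks.
# The cmdlet writes an HTML report; review it before creating the cluster.
Test-Cluster -Node "s2d-fc1-0", "s2d-fc1-1" `
    -Include "Storage Spaces Direct", "Inventory", "Network", "System Configuration"
```

Warnings in the report are not necessarily blockers, but errors should be resolved before you continue.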

Creating a cluster quorum is also part of the failover cluster configuration. We use the storage account created during resource provisioning:

Set-ClusterQuorum -CloudWitness -AccountName <StorageAccountName> -AccessKey <StorageAccountAccessKey>

The storage account name and the access key can be found in the Azure Portal.
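
As an alternative to copying the key from the portal, it can be read with Azure PowerShell. This is a sketch that assumes the Az.Storage module is installed on the cluster node and that you have already signed in with Connect-AzAccount; the resource group and account names are the ones from the resource table.

```powershell
# Assumption: Az.Storage is installed and Connect-AzAccount was run in this session.
$resourceGroup = "S2D-FC1"
$accountName   = "s2dfc1cw"

# Every storage account has two keys; the first one is used here.
$key = (Get-AzStorageAccountKey -ResourceGroupName $resourceGroup -Name $accountName)[0].Value

# Configure the Cloud Witness and display the resulting quorum configuration.
Set-ClusterQuorum -CloudWitness -AccountName $accountName -AccessKey $key
Get-ClusterQuorum
```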

The performance of each of the four disks we’ve created is exactly the same and is limited by the disk size, so we won’t benefit from caching. If you’d like to add caching to your solution, you need to attach additional high-performance disks.

We can display the available disks in our cluster with the command:

Get-PhysicalDisk

Disks with the value True in the CanPool column will be used to create the data pool.
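
To see only the disks that qualify, the output can be filtered on that column. A small sketch; the selected columns are just my choice for readability.

```powershell
# List only the disks that Storage Spaces Direct is able to claim for the pool.
Get-PhysicalDisk |
    Where-Object CanPool -eq $true |
    Sort-Object Size |
    Format-Table FriendlyName, SerialNumber, MediaType, CanPool,
        @{ Label = "Size (GB)"; Expression = { [math]::Round($_.Size / 1GB) } } -AutoSize
```

With the landscape from this post I would expect four data disks in the list, two per server.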

Execute the following command to enable Storage Spaces Direct:

Enable-ClusterS2D

The storage system is now set to Storage Spaces Direct mode, which creates a single data pool called “S2D on S2D-FC1”. The pool is a logical representation of the four disks attached to the servers. You can display it with the Get-StoragePool command:

Get-StoragePool | FT FriendlyName, FaultDomainAwarenessDefault, OperationalStatus, HealthStatus -autosize

Space in the storage pool is allocated by creating volumes. We require storage resiliency, so in the next command we set the ResiliencySettingName parameter. There are different resiliency options available. We use mirroring, which provides fault tolerance by keeping multiple copies of the data.

New-Volume -StoragePoolFriendlyName <FriendlyName> -FriendlyName <Storage Friendly name> -FileSystem CSVFS_ReFS -Size <size> -ResiliencySettingName Mirror
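
For illustration, this is how the command could look with concrete values. The volume name and size are hypothetical; the wildcard simply matches the pool name that S2D generated. Keep in mind that with two servers the mirror keeps two copies of the data, so the volume consumes roughly twice its size in the pool.

```powershell
# Hypothetical values: a 64 GB mirrored volume named "SAPVolume".
New-Volume -StoragePoolFriendlyName "S2D*" `
    -FriendlyName "SAPVolume" `
    -FileSystem CSVFS_ReFS `
    -Size 64GB `
    -ResiliencySettingName Mirror

# Check the resiliency settings and the health of the new virtual disk.
Get-VirtualDisk -FriendlyName "SAPVolume" |
    Format-Table FriendlyName, ResiliencySettingName, NumberOfDataCopies, Size, HealthStatus -AutoSize

# The volume should also show up as a Cluster Shared Volume.
Get-ClusterSharedVolume
```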

The newly created volume is accessible under C:\ClusterStorage\Volume1. Notice it doesn’t have a drive letter assigned.

Finally, we can add the Scale-Out File Server role to enable a file share for the central services.

Add-ClusterScaleOutFileServerRole -Name <sapglobalhost> -cluster <cluster name>

It wasn’t so difficult! We have configured highly available software-defined storage that we will use for the ASCS installation. For testing purposes, you can copy a file to C:\ClusterStorage\Volume1 and see that the file is available on both hosts. You can also shut down one of the cluster nodes to see that the files do not disappear.
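
If you run the shutdown test, the following commands give a quick view of the storage cluster state. This is only a sketch of what I would look at; while a node is down, I would expect the virtual disk to report a degraded state until the node returns and the data is resynchronized.

```powershell
# Which nodes are currently up?
Get-ClusterNode | Format-Table Name, State -AutoSize

# Health of the mirrored volume.
Get-VirtualDisk | Format-Table FriendlyName, OperationalStatus, HealthStatus -AutoSize

# Repair and resync jobs are listed here after the node comes back.
Get-StorageJob
```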

CENTRAL SERVICES INSTANCE INSTALLATION

In the current release of Software Provisioning Manager, the installation of a highly available Central Services instance is not yet supported. The workaround is to perform local installations and build the failover cluster manually.

You can build the failover cluster using the commands I have presented in the previous chapter. You can continue reading after creating a quorum witness.

Use the command Get-ClusterResource | Get-ClusterParameter to display all cluster parameters.

When the failover cluster is built, you can start the Software Provisioning Manager. Install the ASCS and ERS instances on both nodes of the cluster. Remember that we can’t use the highly available installation option; go for the distributed one and install the instances on local drives.

The SID of my system is S2D. I’m installing it on the C: drive. It is important that you use the same settings during the installation on the second host.

The standard load balancing rules created during template provisioning assume that the instance number of the ASCS is 00. If you wish to use a different instance number, you will need to modify the load balancing rules.

As we are doing a local installation, please use your node hostname.

The installation is complete:

You can quickly check in MMC that all components are running successfully on both hosts.

We have performed the same installation on both hosts, so the profile parameters should be the same. I used a Notepad++ add-on to compare the files. The only difference is the hostname, so no surprises here.

To preserve the lock table during failover, an additional ERS instance must be installed. With the highly available installation option this step is performed automatically. In our case, we have to run the Software Provisioning Manager again and install the Enqueue Replication Server separately.

Software Provisioning Manager automatically detects the ASCS instance:

During the install, I received an unexpected error message. My instance did not start up.

I opened the dev_enrepsrv file and I saw many connection errors:

[Thr 4732] Sat Feb 17 18:27:36 2018
[Thr 4732] ***LOG Q0I=> NiPConnect2: 10.0.0.11:50016: connect (10061: WSAECONNREFUSED: Connection refused) [D:/depot/bas/749_REL/src/base/ni/nixxi.cpp 3428]
[Thr 4732] *** ERROR => NiPConnect2: SiPeekPendConn failed for hdl 3/sock 768
    (SI_ECONN_REFUSE/10061; I4; ST; 10.0.0.11:50016) [nixxi.cpp    3428]
[Thr 4732] *** ERROR => EncNiConnect: unable to connect (NIECONN_REFUSED). See SAP note 1943531 [encomi.c     445]
[Thr 4732] *** ERROR => RepServer: main: no connection to Enqueue Server (rc=-7; ENC_ERR_REFUSED) => try again in 1600 ms (see SAP note 1943531) [enrepserv.cp 833]

The fix is quite simple and requires changing a parameter in the default profile of the ERS instance:

system/secure_communication = ON

Before going back to SWPM, try to start the ERS instance manually in SAP MMC. Once it is working correctly, you can continue the installation.

Finally, when the execution is finished, shut down the ASCS and ERS instances and their services.

CONFIGURE SHARE

The Storage Spaces Direct and Scale-Out File Server functionalities enabled in the previous chapter mirror the attached disks to guarantee fault tolerance. Our highly available scenario requires a file share that will be used to serve sapmnt.

Go to the Failover Cluster Manager, select Roles, choose the Scale-Out File Server role that we have defined and click Add File Share.

The attached storage in the S2D cluster can be used by many different applications – not only the SAP Central Services. In this step, you can decide whether to create the file share for the entire cluster volume or just for a selected path.

Assign full control of the share to the following objects:

<domain>\SAP_<SID>_GlobalAdmin

<domain>\<clusternode1>$

<domain>\<clusternode2>$

<domain>\<installation_user>

Remember to modify the security permissions accordingly.
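
The same share can also be created with PowerShell on one of the S2D cluster nodes. Treat this as a sketch: the folder path, the domain name, the installation user and the Scale-Out File Server name are placeholders I made up for illustration, and the accounts follow the list above.

```powershell
# Hypothetical values; replace them with your own.
$sofsName  = "sapglobalhost"                       # name used for the Scale-Out File Server role
$domain    = "MYDOMAIN"
$sid       = "S2D"
$sharePath = "C:\ClusterStorage\Volume1\usr\sap"   # folder on the cluster volume

# Create the folder that will be shared.
New-Item -Path $sharePath -ItemType Directory -Force | Out-Null

# Accounts that need full control, following the list above.
$fullAccess = @(
    "$domain\SAP_${sid}_GlobalAdmin",
    "$domain\s2d-xscs-0$",
    "$domain\s2d-xscs-1$",
    "$domain\installuser"
)

# Create a continuously available share scoped to the Scale-Out File Server name.
New-SmbShare -Name "sapmnt" -Path $sharePath -ScopeName $sofsName `
    -FullAccess $fullAccess -ContinuouslyAvailable $true -CachingMode None

# Apply the share permissions to the NTFS ACL of the folder as well.
Set-SmbPathAcl -ShareName "sapmnt" -ScopeName $sofsName
```

Set-SmbPathAcl covers the "security permissions" reminder above by aligning the folder ACL with the share permissions.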

You can log in to the ASCS instance and try to open the share. You can see it’s working well on my servers:

During the installation, SWPM created a local sapmnt share on each of the cluster nodes. As we are going to reference the newly created file share in the system profile, we don’t need these local shares and can safely delete them.

Change the permissions for the remaining share, saploc, and assign full control to the group <domain>\SAP_<SID>_GlobalAdmin. Delete Everyone from the list.

The file share should contain the SYS directory of the instance, so we need to relocate the SAP libraries. Under the sapmnt share, create a directory <SID> and move the C:\usr\sap\SYS directory into it.

On the secondary node delete the SYS directory completely.

MODIFY PROFILE PARAMETERS

We need to implement a few corrections to the profile parameters to point the ASCS instance to the load balancer and not a single node. We have moved the SYS directory already to the new file share, so all the profile files can be found there.

The first thing to do here is to update the file name. In all SAP installations, the profile parameter file name consists of:

  • System ID
  • Instance type and number
  • Hostname

We want to change the hostname from the one that points to the single cluster node to the one that will be common for the entire failover cluster. We will assign this hostname to the load balancer IP during the resource provisioning in a failover cluster.
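
The naming rule can be sketched as a small helper, which is my own illustration and not part of any SAP tooling. The virtual hostname s2d-ascs and the profile directory are hypothetical; the resulting pattern matches the profile names used later in this post, such as S2D_ERS10_s2d-xscs-1.

```powershell
# Build a profile file name from its three parts: <SID>_<type><number>_<hostname>.
function Get-SapProfileName {
    param(
        [Parameter(Mandatory)][string]$Sid,
        [Parameter(Mandatory)][string]$InstanceType,
        [Parameter(Mandatory)][ValidateRange(0, 99)][int]$InstanceNumber,
        [Parameter(Mandatory)][string]$HostName
    )
    # The instance number is always written with two digits.
    "{0}_{1}{2:D2}_{3}" -f $Sid, $InstanceType, $InstanceNumber, $HostName
}

$oldName = Get-SapProfileName -Sid "S2D" -InstanceType "ASCS" -InstanceNumber 0 -HostName "s2d-xscs-0"
$newName = Get-SapProfileName -Sid "S2D" -InstanceType "ASCS" -InstanceNumber 0 -HostName "s2d-ascs"

$oldName   # S2D_ASCS00_s2d-xscs-0
$newName   # S2D_ASCS00_s2d-ascs

# Hypothetical profile directory on the new share; the rename only runs if the file exists.
$profileDir = "\\sapglobalhost\sapmnt\S2D\SYS\profile"
$oldPath = Join-Path $profileDir $oldName
if (Test-Path $oldPath) {
    Rename-Item -Path $oldPath -NewName $newName
}
```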

Open the DEFAULT profile in Notepad and correct the following parameters:

SAPGLOBALHOST – should point to the Scale-Out File Server hostname

rdisp/mshost – hostname assigned to ASCS instance

enque/serverhost – hostname assigned to ASCS instance

In the ASCS profile file, modify the following parameters:

SAPGLOBALHOST – should point to the Scale-Out File Server hostname

DIR_PROFILE – should point to the Scale-Out File Server hostname

_PF – correct instance profile file name

service/check_ha_node – add new parameter with value 1
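
If you would rather script these edits than make them in Notepad, a helper along the following lines could do it. It is a minimal sketch of my own: it works on the lines of a profile held in memory, replaces an existing parameter or appends a missing one, and uses hypothetical hostnames (sapglobalhost for the Scale-Out File Server, s2d-ascs for the ASCS). Take a backup of the real profiles before writing anything back.

```powershell
# Replace the value of a profile parameter, or append the parameter if it is missing.
function Set-SapProfileParameter {
    param(
        [string[]]$Lines,
        [string]$Name,
        [string]$Value
    )
    $pattern = '^\s*' + [regex]::Escape($Name) + '\s*='
    $output  = [System.Collections.Generic.List[string]]::new()
    $found   = $false
    foreach ($line in $Lines) {
        if ($line -match $pattern) {
            $output.Add("$Name = $Value")
            $found = $true
        } else {
            $output.Add($line)
        }
    }
    if (-not $found) { $output.Add("$Name = $Value") }
    $output.ToArray()
}

# Sample content standing in for a real profile file.
$profileLines = @(
    "SAPGLOBALHOST = s2d-xscs-0",
    "rdisp/mshost = s2d-xscs-0",
    "enque/serverhost = s2d-xscs-0"
)

$profileLines = Set-SapProfileParameter -Lines $profileLines -Name "SAPGLOBALHOST" -Value "sapglobalhost"
$profileLines = Set-SapProfileParameter -Lines $profileLines -Name "rdisp/mshost" -Value "s2d-ascs"
$profileLines = Set-SapProfileParameter -Lines $profileLines -Name "enque/serverhost" -Value "s2d-ascs"
$profileLines = Set-SapProfileParameter -Lines $profileLines -Name "service/check_ha_node" -Value "1"

$profileLines
```

For a real file you would read the lines with Get-Content and write the result back with Set-Content.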

Also modify the following lines by changing Restart to Start:

In the ERS instance profile, which is located under C:\usr\sap\<SID>\ERS10\profile, modify the parameter SCSHOST to point to the target ASCS hostname.

We also need to correct the environment variables defined during the installation. Log in as <SID>adm and correct the following lines:

The last thing to do is to modify the startup parameters of the SAP services. The easiest way is to edit the entries in the Windows registry. Open regedit, go to HKLM\System\CurrentControlSet\Services and correct the profile file name in the ImagePath value.
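
The registry change can be previewed from PowerShell as well. This sketch makes several assumptions: the service name SAPS2D_00 (I am assuming the usual SAP<SID>_<instance number> pattern, so verify it on your system), the old node hostname, and a hypothetical virtual hostname s2d-ascs. It only swaps the hostname suffix of the profile file name; if the directory part of the path also has to change, adjust the replacement accordingly. The -WhatIf switch means nothing is written until you remove it.

```powershell
# Assumed service name and hostnames; verify them before use.
$serviceName = "SAPS2D_00"
$regPath     = "HKLM:\SYSTEM\CurrentControlSet\Services\$serviceName"
$oldHost     = "s2d-xscs-0"
$newHost     = "s2d-ascs"

# Read the current start command of the service.
$imagePath    = (Get-ItemProperty -Path $regPath -Name ImagePath).ImagePath
$newImagePath = $imagePath.Replace("_$oldHost", "_$newHost")

Write-Host "Current : $imagePath"
Write-Host "Proposed: $newImagePath"

# Remove -WhatIf once the proposed value looks right.
Set-ItemProperty -Path $regPath -Name ImagePath -Value $newImagePath -WhatIf
```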

CREATE FAILOVER CLUSTER RESOURCES

We have completed all the preparation steps, so we can move on to the actual failover cluster configuration. We start by installing the latest SAP NTCLUST libraries, which register resource definitions in the Failover Cluster. Support for using a file share comes in the latest release, which you can obtain from the SAP ONE Support Launchpad. Execute the following command:

insaprct.exe -install

After a few seconds we get confirmation that the installation went fine:

When the additional resources are registered you can open the Failover Cluster Manager and assign the full control permission to the group SAP_<SID>_GlobalAdmin.

We are ready to create SAP roles and resources. Create an empty role from the context menu of the cluster. Change its name to SAP <SID>.

Select the newly created role and add a Client Access Point resource. The cluster hostname has to be the same as the one we defined when modifying the profile parameters.

Next, display the IP Address properties and modify the assigned address. It should point to your Load Balancer IP. In the same window, change the resource name to SAP <SID> IP.

In a Failover Cluster the ASCS runs on only one node. Traffic is directed to the correct server based on the health probe. You can use the following script delivered by Microsoft to create such a probe for SAP resources.

$SAPSID = "PR1"      # SAP <SID>
$ProbePort = 62000   # ProbePort of the Azure Internal Load Balancer

Clear-Host
$SAPClusterRoleName = "SAP $SAPSID"
$SAPIPresourceName = "SAP $SAPSID IP"
$SAPIPResourceClusterParameters =  Get-ClusterResource $SAPIPresourceName | Get-ClusterParameter
$IPAddress = ($SAPIPResourceClusterParameters | Where-Object {$_.Name -eq "Address" }).Value
$NetworkName = ($SAPIPResourceClusterParameters | Where-Object {$_.Name -eq "Network" }).Value
$SubnetMask = ($SAPIPResourceClusterParameters | Where-Object {$_.Name -eq "SubnetMask" }).Value
$OverrideAddressMatch = ($SAPIPResourceClusterParameters | Where-Object {$_.Name -eq "OverrideAddressMatch" }).Value
$EnableDhcp = ($SAPIPResourceClusterParameters | Where-Object {$_.Name -eq "EnableDhcp" }).Value
$OldProbePort = ($SAPIPResourceClusterParameters | Where-Object {$_.Name -eq "ProbePort" }).Value

$var = Get-ClusterResource | Where-Object {  $_.name -eq $SAPIPresourceName  }

Write-Host "Current configuration parameters for SAP IP cluster resource '$SAPIPresourceName' are:" -ForegroundColor Cyan
Get-ClusterResource -Name $SAPIPresourceName | Get-ClusterParameter

Write-Host
Write-Host "Current probe port property of the SAP cluster resource '$SAPIPresourceName' is '$OldProbePort'." -ForegroundColor Cyan
Write-Host
Write-Host "Setting the new probe port property of the SAP cluster resource '$SAPIPresourceName' to '$ProbePort' ..." -ForegroundColor Cyan
Write-Host

$var | Set-ClusterParameter -Multiple @{"Address"=$IPAddress;"ProbePort"=$ProbePort;"Subnetmask"=$SubnetMask;"Network"=$NetworkName;"OverrideAddressMatch"=$OverrideAddressMatch;"EnableDhcp"=$EnableDhcp}

Write-Host

$ActivateChanges = Read-Host "Do you want to restart SAP cluster role '$SAPClusterRoleName' to activate the changes (yes/no)?"

if ($ActivateChanges -eq "yes") {
    Write-Host
    Write-Host "Activating changes..." -ForegroundColor Cyan

    Write-Host
    Write-Host "Taking SAP cluster IP resource '$SAPIPresourceName' offline ..." -ForegroundColor Cyan
    Stop-ClusterResource -Name $SAPIPresourceName
    Start-Sleep -Seconds 5

    Write-Host "Starting SAP cluster role '$SAPClusterRoleName' ..." -ForegroundColor Cyan
    Start-ClusterGroup -Name $SAPClusterRoleName

    Write-Host "New ProbePort parameter is active." -ForegroundColor Green
    Write-Host

    Write-Host "New configuration parameters for SAP IP cluster resource '$SAPIPresourceName':" -ForegroundColor Cyan
    Write-Host
    Get-ClusterResource -Name $SAPIPresourceName | Get-ClusterParameter
} else {
    Write-Host "Changes are not activated."
}

(source: docs.microsoft.com)

Go back to Failover Cluster Manager and create the SAP Service resource:

In the opened window we can modify the SAP Service properties. On the first tab, change the name of the service to “SAP <SID> <Instance> Service”.

On the dependency tab, add a new resource “Name: hostname”.

The third tab should have the value 0 in the field Maximum restarts in the specified period.

On the fourth tab, select Run this resource in a separate Resource Monitor.

And finally, update the Service Name on the last tab.

The next step is to add another component to the cluster: the SAP Resource. All the properties should be the same as for the SAP Service. The only difference is on the last tab, where we need to fill in the SAP instance number and System ID.

That was the last required configuration step. You can bring all the resources online. Your screen should look like the example below:

In case you encounter problems when bringing resources online, I suggest checking the Windows Event Log and the sapstartsrv log files.
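
A few commands that may speed up the search. This is a sketch: the one-hour window, the C:\Temp destination (which must already exist) and the sapstartsrv.log path are my assumptions, so adjust them to your installation.

```powershell
# Recent failover clustering events from the System log.
Get-WinEvent -FilterHashtable @{
    LogName      = "System"
    ProviderName = "Microsoft-Windows-FailoverClustering"
    StartTime    = (Get-Date).AddHours(-1)
} -ErrorAction SilentlyContinue |
    Select-Object TimeCreated, Id, LevelDisplayName, Message -First 20

# Generate the detailed cluster log for the last 15 minutes on every node.
Get-ClusterLog -Destination "C:\Temp" -TimeSpan 15

# Assumed location of the sapstartsrv log of the ASCS instance.
Get-Content "C:\usr\sap\S2D\ASCS00\work\sapstartsrv.log" -Tail 50
```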

VERIFICATION

We have finished the configuration required to enable the High Availability feature of the SAP Central Services instance. That means there is no single point of failure in the landscape. In case of a node failure, operations will be redirected by the Load Balancer to the active node. The lock table entries are constantly replicated to the Enqueue Replication Server, which always runs on the second server.

In the SAP MMC you can see that the running services are:

ASCS instance on the primary cluster node

ERS instance on the secondary cluster node

We can start the verification by confirming the replication server is connected to the ASCS instance.

ensmon pf=C:\usr\sap\S2D\ERS10\profile\S2D_ERS10_s2d-xscs-1 2

It looks good. Let’s create some sample data in the lock table to see how the system behaves during the failover.

enqt pf=C:\usr\sap\S2D\ERS10\profile\S2D_ERS10_s2d-xscs-1 11 10

The log entries can be seen in the MMC:

Now it’s the BIG moment. I’m going to shut down the node s2d-xscs-0. I expect that the Failover Cluster will immediately start the ASCS on the secondary node. The ERS instance is running, so all the lock table entries should be synchronized.

I have shut down the primary node. After a few seconds the instance on the secondary node started. In SAP MMC we can see that both ASCS and ERS services are available. The first test passed! What about the lock entries?

All lock table entries are imported by the newly started instance. Success! We can open the champagne and celebrate!
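
If you want to repeat the test without powering off a VM, a planned failover can be triggered from PowerShell. A sketch, assuming the role is named SAP S2D as created earlier and that s2d-xscs-1 is the target node; note that a planned move is a gentler test than a real node crash.

```powershell
$role = "SAP S2D"

# Where is the role running right now?
Get-ClusterGroup -Name $role | Format-Table Name, OwnerNode, State -AutoSize

# Move the whole role, including the IP, network name and SAP resources, to the other node.
Move-ClusterGroup -Name $role -Node "s2d-xscs-1"

# Confirm that all resources of the role came online on the new owner.
Get-ClusterGroup -Name $role | Get-ClusterResource |
    Format-Table Name, State, OwnerNode -AutoSize
```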

7 Comments

    1. Bartosz Jarkowski Post author

      Hello Peter,

      that’s a very good question.

      Azure Backup Server supports Storage Spaces Direct, so that would be my first choice (but I haven’t tested it yet). Note, however, that you will be able to back up the files, but not the entire VM.

       
