Author: Anjan Banerjee

SAP on Azure: Highly Available (HA) setup of SAP Netweaver 7.5 on DB2 11.5 on Windows using SIOS Datakeeper

Introduction

This blog provides guidance for setting up a highly available (HA) SAP Netweaver 7.5 system on an IBM DB2 11.5 database in a Windows environment. There are 2 ways to achieve an HA setup of the DB2 database:

  • Shared Disk based Setup (covered in this blog)
  • HADR Based Setup – Share Nothing Cluster

Shared Disk based Setup: This is the classical HA setup in which 2 DB nodes (in 2 VMs) share the same disks for the DB DATA & LOG files, and Windows Server Failover Clustering (WSFC) manages the automatic failover of the cluster resources (DB, shared storage, and virtual IP) if the primary VM node becomes unavailable. This blog covers the DB2 HA setup using the shared disk approach. In Azure, we can use SIOS Datakeeper to create the shared disk for the DB2 database and integrate it with the Windows cluster (WSFC).

The HADR shared nothing cluster is a software-based replication method for DB2 in which 2 separate database nodes run on separate VMs with non-shared storage. The native database replication method is used to synchronize the databases. The main challenge is that integration with WSFC is not available, so failover of the DB2 DB to the 2nd node (VM) will not be automatic.

This blog can be used as a reference document for deploying a highly available SAP production system with a DB2 database on Windows. It does not cover other aspects of SAP system deployment in detail, such as the SAP ASCS/ERS layer, application servers, security, performance, and monitoring.

Overview of High Availability Setup of IBM DB2 for SAP Environment

DB2 LUW offers a number of different High Availability configurations.  There are two fundamental High Availability topologies supported by DB2 LUW.

  • Shared Disk based Setup – Shared everything cluster (Supported and covered in this blog)
  • HADR Based Setup – Shared nothing cluster

In a Linux based HA setup, DB2 LUW provides Pacemaker Resource Agents for the HADR Based Setup.

For a Windows based HA setup, there are no native cluster agents for the HADR Based Setup. HADR is only supported on the Windows platform with Tivoli agents[1]. DB2 High Availability Disaster Recovery (HADR) and IBM® Tivoli System Automation for Multiplatforms (SA MP) agents are not aware of the Azure Load Balancer probe port, so it is currently not possible to configure HADR based shared nothing clustering on Windows on Azure Public Cloud.

System Design

The diagram below is a high-level architecture design for the reference setup of SAP Netweaver on DB2 on Windows HA deployment. It covers redundancy for all the single points of failure within the SAP environment and automatic failover in case any component fails.

The deployment uses the availability zone (AZ) concept in Azure, which provides 99.99% VM availability for each set of VMs spread across AZs. Application servers within an AZ can also be part of an availability set.

The same setup can be deployed with availability sets, which provide 99.95% VM availability for each set of VMs in the AvSet.

The above design can be extended to another region for a Disaster Recovery deployment, in which the SAP application layer is replicated across regions using Azure Site Recovery. For the database layer, SIOS Datakeeper based async replication can be set up to a DR DB VM in the other region.

Following are the details for each of the components of the setup:

IBM DB2 Database Layer

For HA of the DB2 database on Windows, we can install DB2 on a 2-node (2 VMs) WSFC cluster with SIOS Datakeeper. Both VMs have the same number and sizes of Azure Premium/Ultra tier managed disks (LRS) attached. SIOS Datakeeper synchronizes the DATA & LOG disks/drives between these 2 nodes (VMs) and presents them as shared disks to the WSFC cluster. The Windows cluster has resources/routines to manage the automatic failover of the DB2 database between the primary and secondary DB nodes, along with the virtual hostname/IP and the attached disks.

SAP ASCS/ERS Layer

HA: SAP ASCS/ERS is deployed with a WSFC cluster, and Azure Standard Load Balancer is used to define the virtual IP/hostname of ASCS.

DR: Azure Site Recovery (ASR) can be used to replicate the VMs to the secondary region.

SAPMNT Layer

An SAP system HA setup requires a highly available ‘sapmnt’ share which is mounted across the ASCS/ERS, database and application server VMs. This setup uses Azure Premium Files SMB, which is a PaaS file share service in Azure and very convenient to deploy. It is recommended to use a ZRS type Azure storage account for cross-zone high availability of Azure Files SMB. The ‘sapmnt’ contents in Azure Files can be synced to another Azure Files share in the secondary region for the disaster recovery setup.

Azure provides multiple other options to set up the ‘sapmnt’ file share, such as an SOFS cluster, a SIOS Datakeeper file share, or Azure NetApp Files (premium). Azure Shared Disk is another option to set up the ASCS/ERS cluster with a shared disk for ‘sapmnt’ on NetWeaver 7.40 and higher releases (SAP Note 2698948).

SAP Application Servers Layer

Multiple SAP application servers are deployed in an availability set in each AZ, load balancing users and providing high availability through SAP logon groups. ASR is used to replicate the VMs across regions. A Proximity Placement Group (PPG) should be defined to co-locate the SAP application and database layers and minimize latency.

Overview of Deployment Steps

Following are the high-level steps which need to be followed for HA setup of SAP NetWeaver with IBM DB2 DB on Windows.

  1. Preparations
  2. Azure Premium Files SMB File Share for ‘sapmnt’ or alternate options for ‘sapmnt’ share.
  3. SAP ASCS/ERS HA Setup
  4. IBM DB2 Database HA Setup
  5. SAP PAS and AAS Installation
  6. SAP ASCS/ ERS and Database High Availability Test

This blog is intended to provide a step-by-step guide for the IBM DB2 DB HA setup using SIOS Datakeeper and skips steps 2, 3 and 5 for the ASCS & application server setup. The links/documents below can be used for the HA setup of ASCS.

Preparations

  • Read the required installation guide, SAP Notes and SAP on Azure docs, and download the installation media (SAP & SIOS).
  • Deploy the VMs in Availability Zones and include them in a Proximity Placement Group (PPG) as per the system architecture design. Choose Windows Server 2019 as the operating system. Follow the reference architecture in this link.
  • Follow the SAP on Azure DB2 DB specific recommendations in this link.
  • Add data and log disks to the DB VMs. It is recommended to use Premium or Ultra managed disks (locally redundant storage – LRS). Create the required drives in Windows.
  • Join the VMs to the Domain.
  • Define Page File in Temp Disk (D Drive).
  • Check that necessary Ports (including ILB Probe ports) are open in Windows firewall and NSG.
  • Disable the Continuous Availability Feature in Windows using the instructions in this link.
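
For the firewall item in the list above, the following is a minimal sketch of opening the load balancer health probe port in the Windows firewall on each DB node. The rule name is hypothetical, and the port 63035 is the health probe port used later in this reference setup. The matching NSG rule in Azure is not covered here.

```powershell
# Sketch only: run in an elevated PowerShell session on both DB nodes.
# "DB2 ILB health probe" is a hypothetical rule name.
$ProbePort = 63035   # health probe port of the reference setup

New-NetFirewallRule -DisplayName "DB2 ILB health probe" `
    -Direction Inbound -Protocol TCP -LocalPort $ProbePort -Action Allow

# Confirm that the rule exists and is enabled.
Get-NetFirewallRule -DisplayName "DB2 ILB health probe" |
    Select-Object DisplayName, Enabled, Direction, Action
```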

IBM DB2 DB HA Setup

  • Create an Azure Internal Load Balancer (Standard) to provide the virtual hostname/IP of the DB2 DB.

Create the front-end IP, backend pool, health probe, and load balancing rule for the cluster role.

Following are the details of the reference setup.

    • Front-end IP: 10.2.2.135 (Virtual IP for HA)
    • Backend pool: azwindb2db01 and azwintdb2db02 (VM hostnames)
    • Health probe port: 63035
    • Load balancing rule: Enable HA Port, Enable Floating IP, Idle Timeout (30 Minutes)
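
The load balancer settings above could also be scripted with the Az PowerShell module. The following is a rough sketch, not the procedure used for this blog: the resource group, region, virtual network, subnet and all resource names are hypothetical, and only the front-end IP, probe port, HA ports, floating IP and idle timeout come from the reference setup. Adding the two DB VM NICs to the backend pool is a separate step that is not shown.

```powershell
# Sketch only: requires the Az.Network module and a signed-in Azure session.
# All names below (rg-sap-db2, vnet-sap, snet-db, ilb-db2, ...) are hypothetical.
$rg       = "rg-sap-db2"
$location = "westus2"
$vnet     = Get-AzVirtualNetwork -ResourceGroupName $rg -Name "vnet-sap"
$subnet   = Get-AzVirtualNetworkSubnetConfig -VirtualNetwork $vnet -Name "snet-db"

# Front-end IP, backend pool and health probe from the reference setup.
$fe    = New-AzLoadBalancerFrontendIpConfig -Name "fe-db2" `
             -PrivateIpAddress "10.2.2.135" -SubnetId $subnet.Id
$pool  = New-AzLoadBalancerBackendAddressPoolConfig -Name "be-db2"
$probe = New-AzLoadBalancerProbeConfig -Name "hp-db2" -Protocol Tcp -Port 63035 `
             -IntervalInSeconds 5 -ProbeCount 2

# HA ports rule: protocol All with front-end and backend port 0,
# floating IP enabled, idle timeout 30 minutes.
$rule  = New-AzLoadBalancerRuleConfig -Name "lbr-db2-haports" `
             -FrontendIpConfiguration $fe -BackendAddressPool $pool -Probe $probe `
             -Protocol All -FrontendPort 0 -BackendPort 0 `
             -EnableFloatingIP -IdleTimeoutInMinutes 30

New-AzLoadBalancer -ResourceGroupName $rg -Name "ilb-db2" -Location $location `
    -Sku Standard -FrontendIpConfiguration $fe -BackendAddressPool $pool `
    -Probe $probe -LoadBalancingRule $rule
```

The probe interval and probe count are illustrative values, since the reference setup does not state them.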

 

  • Install Failover Clustering on cluster Nodes.

This can be done in Server Manager -> Manage -> Add Roles and Features
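
As an alternative to Server Manager, the feature can be installed from PowerShell. This is a minimal sketch that assumes an elevated session with administrative rights on both nodes; the host names are the ones listed in the reference setup above.

```powershell
# Sketch only: installs the Failover Clustering feature and its management tools.
$nodes = "azwindb2db01", "azwintdb2db02"   # VM hostnames as listed in the reference setup

foreach ($node in $nodes) {
    Install-WindowsFeature -Name Failover-Clustering -IncludeManagementTools -ComputerName $node
}
```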

  • In DB VM1, Create the Failover Cluster and
    • Add Node1 (azwindb2db01) to the cluster.
    • Run the validation checks
    • Define the cluster name
    • Complete the cluster creation.

If it's Windows Server 2016, then update the cluster IP and start the cluster services.

Choose the Cloud Witness option in the next screen and provide the storage account details for the cloud witness. Ideally, the Cloud Witness storage account should be in an Azure region that is totally independent of any infrastructure that is part of the cluster. For example, if a DB2 cluster is in US West zone 1 and zone 2, the Cloud Witness can be in the US East region.

    • Add Node2 to the WSFC cluster
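
The cluster creation steps above can be sketched with the FailoverClusters PowerShell module as follows. This is an illustration under assumptions, not the exact procedure of this blog: the cluster IP address is a hypothetical free address in the DB subnet, and the storage account name and key are placeholders that must be replaced. The cluster name is the one used later in db2mscs.ese.

```powershell
# Sketch only: run in an elevated session on DB node 1.
Import-Module FailoverClusters

$node1       = "azwindb2db01"
$node2       = "azwintdb2db02"              # hostname as listed in the reference setup
$clusterName = "db2-pt1-cls"                # cluster name used in db2mscs.ese
$clusterIp   = "10.2.2.136"                 # hypothetical free IP in the DB subnet
$witnessAcct = "<storage account name>"     # placeholder
$witnessKey  = "<storage account key>"      # placeholder

# Validate node 1, then create the cluster without claiming any disks.
Test-Cluster -Node $node1
New-Cluster -Name $clusterName -Node $node1 -StaticAddress $clusterIp -NoStorage

# Configure the cloud witness, then add the second node.
Set-ClusterQuorum -CloudWitness -AccountName $witnessAcct -AccessKey $witnessKey
Add-ClusterNode -Name $node2

# Both nodes are expected to show as "Up".
Get-ClusterNode
```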

 

  • Install SIOS Datakeeper on both the nodes.
    • Start the installation.
    • In the DataKeeper Service Logon Account Setup

Select the Domain or Server account option.

Precreate the user (e.g. DataKeeperSvc), add this user to the local Administrators group, and restart the servers.

    • Install SIOS License.
  • Configure Datakeeper replication.
    • Create the Disk replication jobs for data and log drives. Ensure that replication is in ‘sync’ mode.
    • Choose ‘yes’ to auto-register the volumes to cluster.
    • Check the Disk Mirroring Jobs are running.
    • Verify the replicated disks are added to the cluster.
  • In the Windows DNS manager, create a DNS entry for the virtual host name of the DB2 DB instance.

The IP address that we assign to the virtual host name of the DB2 DB instance must be the same as the front-end IP defined in the Azure Load Balancer.

Select the check boxes “Create associated pointer (PTR) record” and “Allow any authenticated user to update DNS records with the same owner name”.
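
The same DNS entry could be created with the DnsServer PowerShell module instead of the DNS manager. This is a sketch under assumptions: the zone name is taken from the domain that appears later in db2mscs.ese, the virtual host name is a placeholder, and a matching reverse lookup zone is assumed to exist for the PTR record.

```powershell
# Sketch only: run on a DNS server (or a host with the DNS RSAT tools) with sufficient rights.
$zone        = "contoso234.com"               # domain as it appears in db2mscs.ese
$virtualHost = "<virtual-db-hostname>"        # placeholder for the virtual DB host name
$ilbIp       = "10.2.2.135"                   # front-end IP of the ILB

# -CreatePtr and -AllowUpdateAny correspond to the two check boxes mentioned above.
Add-DnsServerResourceRecordA -ZoneName $zone -Name $virtualHost `
    -IPv4Address $ilbIp -CreatePtr -AllowUpdateAny

# Confirm the record.
Get-DnsServerResourceRecord -ZoneName $zone -Name $virtualHost -RRType A
```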

  • SWPM Setup on DB Node 1

(Will be referring to relevant SWPM screens)

    • Choose the DB2 Database Instance Install in High Availability Section
    • Provide the path of ‘sapmnt’ Profile directory which is part of HA File share.
    • Specify the DBSID and virtual DB hostname
    • Perform the installation based on domain users.
    • Provide the Drive details of data and log files.
    • Continue with the installation.
    • Installation halts to perform the manual setup of Node 2 and Cluster Setup. Perform the next steps to complete the cluster setup.
  • Preparing the Additional(2nd) Node for High Availability
    • Copy the file ..\ESE\win_ese.rsp from the RDBMS media(DVD) to a local directory.
    • Edit the file win_ese.rsp as follows :

Replace @DB2SOFTWAREPATH@ with <local_drive_for_database_software>:\db2\db2<dbsid>\db2_software

Replace @DBSID@ with <DBSID>.

Replace @DOMAIN@ with <WIN_DOMAIN_NAME>, where <WIN_DOMAIN_NAME> is the Windows domain name of this domain installation.
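
The three replacements above can also be scripted. The following is a minimal sketch: the local path of the copied response file is hypothetical, and the DBSID, domain and software path are illustrative values modeled on this reference setup that must be adapted.

```powershell
# Sketch only: replaces the placeholders in a local copy of win_ese.rsp.
$rspPath      = "C:\temp\db2\win_ese.rsp"        # hypothetical local copy of the file
$dbsid        = "PT1"                            # example DBSID
$domain       = "contoso234.com"                 # example Windows domain name
$softwarePath = "I:\db2\db2pt1\db2_software"     # example DB2 software path

$content = Get-Content -Path $rspPath -Raw
$content = $content.Replace('@DB2SOFTWAREPATH@', $softwarePath)
$content = $content.Replace('@DBSID@', $dbsid)
$content = $content.Replace('@DOMAIN@', $domain)
Set-Content -Path $rspPath -Value $content -NoNewline

# List any remaining @PLACEHOLDER@ tokens; no output means nothing was missed.
Select-String -Path $rspPath -Pattern '@[A-Z0-9_]+@'
```

String.Replace is used instead of the -replace operator so that backslashes in the path are treated literally and not as regular expression syntax.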

    • Change to the local directory to which you copied file win_ese.rsp
    • Enter the following command to install DB2 software.

<Full_path_to_RDBMS_media_path>\ESE\image\setup /i en /l db2setup.log /m /u win_ese.rsp

    • Check the installation log file in the local directory to verify that the installation completed successfully.
    • Install all required operating system users and groups on each cluster node.

Run the installer and choose -> Generic Installation Options -> IBM Db2 for Linux, UNIX, and Windows->Preparation->Operating System Users and Groups.

  • Setup the DB2 HA Cluster Resource.

The db2mscs utility makes the database aware of the cluster and creates the resources in WSFC. All the commands/activities below need to be executed on the primary node.

    • Log in to the primary DB node as user db2<sid>.
    • Check whether the default DB2 copy is set to SAPDB2<DBSID> using the db2swtch utility command ‘db2swtch -l’ (small L)
    • Make sure user db2<sid> is a member of the domain admin group and of the local admin group on both nodes.
    • In Active Directory Users and Computers, add the cluster name (e.g. db2-pt1-cls) as a member of the group DB2ADMNS_<SID>.

Restart both cluster nodes.

    • Edit the db2mscs.ese file as follows:

Copy the file from “I:\db2\db2pd1\db2_software\cfg” (db2 install location) and edit the content. Below is the content for reference.

#
# Global section
#
DB2_INSTANCE=DB2PT1
DB2_LOGON_USERNAME=contoso234.com\db2pt1
DB2_LOGON_PASSWORD=<password>
CLUSTER_NAME=db2-pt1-cls
#
# MSCS Group for Node 0
#
GROUP_NAME=DB2PT1Group
DB2_NODE=0
IP_NAME=DB2IPPT1
IP_ADDRESS=10.2.2.135
IP_SUBNET=255.255.255.224
IP_NETWORK=Cluster Network 1
NETNAME_NAME=DB2NetNamePT1
DISK_NAME=DataKeeper Volume M
DISK_NAME=DataKeeper Volume N
INSTPROF_DISK=M:\db2\db2pt1\DB2PT1

Additional remarks about the db2mscs.ese.

      • Remove the node 1 section in the db2mscs.ese file.
      • DB2_INSTANCE should be DB2<SID>.
      • DB2_LOGON_USERNAME should be db2 admin OS user name(domain user)
      • CLUSTER_NAME is the windows failover cluster name
      • GROUP_NAME should be DB2<SID>Group
      • IP_NAME field value should be DB2IP<SID>
      • IP_ADDRESS is the frontend IP address of the ILB
      • IP_SUBNET can be checked from ipconfig command.
      • IP_NETWORK value should be cluster network name which is usually ‘Cluster Network 1’
      • NETNAME_NAME value should be DB2NetName<SID>
      • NETNAME_VALUE is for virtual hostname of the DB2 resource. It corresponds to frontend IP defined in the ILB. Application servers will connect to DB using this hostname.
      • NETNAME_DEPENDENCY value must be same as field IP_NAME
      • In the DISK_NAME field, match the volume name as shown in the failover cluster.
      • INSTPROF_DISK should be set to the instance directory.
    • Open the Command Prompt and run as administrator.
    • Go to the path of db2mscs.ese file.
    • Run the command to create DB2 Cluster Resources.

db2mscs -f db2mscs.ese -d db2mscs.out

Log file db2mscs.out will be created in the current directory.

    • Verify that DB2 Cluster Resources are setup and running successfully.
  • Update the Probe Port

Update the DB2 cluster IP resource so that the Azure Standard Load Balancer can monitor it through the health probe and the virtual hostname/IP assignment works.

    • Check the current probe port assignment. Before the update, the value is expected to be 0.

Get-ClusterResource "DB2IP<SID>" | Get-ClusterParameter

    • Update the probe port (as defined under health probe in the ILB) in the cluster configuration.

Import-Module FailoverClusters
$ClusterNetworkName = "Cluster Network 1"
$IPResourceName = "DB2IP<SID>"
$ILBIP = "<Front-end IP of ILB>"
# Health probe port defined in the ILB (63035 in the reference setup)
$ProbePort = 63035
Get-ClusterResource $IPResourceName | Set-ClusterParameter -Multiple @{Address=$ILBIP;ProbePort=$ProbePort;SubnetMask="255.255.255.255";Network=$ClusterNetworkName;EnableDhcp=0}

Stop and Start the DB2 Cluster Role.

    • Verify the Probe Port is updated in the cluster configuration.
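
The restart and verification can be sketched as follows. The role and resource names are the GROUP_NAME and IP_NAME values from the db2mscs.ese example above and are assumptions for any other system. Restarting the role stops the database, so this is meant for the setup phase only.

```powershell
# Sketch only: restart the DB2 cluster role and read back the probe port.
Import-Module FailoverClusters

$RoleName       = "DB2PT1Group"   # GROUP_NAME from db2mscs.ese
$IPResourceName = "DB2IPPT1"      # IP_NAME from db2mscs.ese

Stop-ClusterGroup -Name $RoleName
Start-ClusterGroup -Name $RoleName

# The value should now show the health probe port instead of 0.
Get-ClusterResource $IPResourceName | Get-ClusterParameter -Name ProbePort
```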
  • Add ‘db2dump’ Fileshare to the DB2 cluster Role.
    • Create the db2dump folder on the shared/replicated drive (e.g. N:\db2\<SID>\db2dump)
    • Create a share for this folder: right-click the DB2 Cluster Role and choose Add File Share.
    • Verify that SMB share is attached to the DB2 Cluster Role.

 

Make sure to Stop and Start the DB2 Cluster role for changes to take effect.

 

  • Continue with SAP SWPM Installation
    • Pause Node 2 in the Failover Cluster Manager so that a failover to the 2nd node can’t happen during the installation process.
    • Continue with the database installation on Node 1 in SWPM. It was halted earlier at around 41%.
    • Once the installation is complete, bring Node 2 back ‘online’ in the Failover Cluster Manager.
    • On the 2nd cluster node, update the file %windir%\System32\drivers\etc\services with the following line:

sapdb2<DBSID>               <service number>

Use the same value as on the first node of the DB cluster.

    • Continue with SAP PAS & AAS Installation.
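
Pausing and resuming Node 2 around the SWPM run, and checking the services entry afterwards, can be sketched in PowerShell as follows. The node name is the one listed in the reference setup and the DBSID PT1 is taken from the db2mscs.ese example; both are assumptions for any other system.

```powershell
# Sketch only: run on a cluster node in an elevated session.
Import-Module FailoverClusters

$node2 = "azwintdb2db02"   # second DB node as listed in the reference setup

# Before continuing SWPM on Node 1: pause Node 2 so roles cannot fail over to it.
Suspend-ClusterNode -Name $node2
Get-ClusterNode            # Node 2 is expected to show the state "Paused"

# ... continue and finish the SWPM database installation on Node 1 ...

# After the installation: bring Node 2 back into the cluster.
Resume-ClusterNode -Name $node2

# On Node 2: check that the DB2 service entry exists in the services file.
Select-String -Path "$env:windir\System32\drivers\etc\services" -Pattern '^sapdb2PT1\s'
```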

Failover Cluster Testing

Failover testing needs to be performed for both the ASCS cluster and the database cluster of the SAP environment.

There are mainly 2 kinds of HA testing that need to be performed:

  1. Planned: Covers planned unavailability of one of the nodes of the cluster pair. This can be tested by stopping the services on the node and verifying that SAP services remain available to users. We can also manually move the cluster role from one node to the other in the Failover Cluster Manager.
  2. Unplanned: This can be tested by crashing the VMs or removing the storage from the VMs. In a Windows environment, a VM crash can be triggered with the tool ‘notmyfault’.
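
A planned failover test of the DB2 role can be sketched with the FailoverClusters module as shown below. The role name is the GROUP_NAME from the db2mscs.ese example and the target node is the second DB node of the reference setup; both are assumptions for other systems. The unplanned tests are not covered by this sketch.

```powershell
# Sketch only: manual move of the DB2 cluster role for a planned failover test.
Import-Module FailoverClusters

$RoleName   = "DB2PT1Group"      # GROUP_NAME from db2mscs.ese
$targetNode = "azwintdb2db02"    # node that should take over the role

# Note the current owner, move the role, then check owner and state again.
Get-ClusterGroup -Name $RoleName | Format-List Name, OwnerNode, State
Move-ClusterGroup -Name $RoleName -Node $targetNode
Get-ClusterGroup -Name $RoleName | Format-List Name, OwnerNode, State

# All resources of the role (IP, network name, DataKeeper volumes, DB2) should be Online.
Get-ClusterGroup -Name $RoleName | Get-ClusterResource
```

After the move, it is worth confirming from an SAP application server that the database is reachable through the virtual hostname.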

References


      6 Comments
      Ashish Tak

      Very informative blog. Thank you.

      Louise Wong

      Well done

      Jason Aw

      Great blog. Very detailed and useful information for anyone to follow.

      AMIT Lal

      Very Nicely articulated! Thanks.

      Thomas Rech

      Good article and an interesting approach to integrate SIOS Datakeeper and present it as a shared disk.
      I do, however, need to add some comments from a Db2 point of view. With Db2, and other databases as well, you may see a so-called torn page or partial page write issue when a crash occurs while Db2 page writes are being broken into storage block sizes.
      This may happen on any storage in very rare cases, but the chance may be increased with disk replication if writes reach the target asynchronously or in a different order while a link error occurs between the source and target storage. For details, the following technote provides some background information: Recovery options for data page corruptions (ibm.com)

      To mitigate the still small probability of this happening, you may check whether the solution preserves the write order and minimizes broken page writes due to replication.
      In addition, make sure a valid backup and log files are available for recovery, which is best practice anyhow.

      Jason Aw

      Hi Thomas. I work for SIOS so I can help to address the concern raised.

      The SIOS data replication solution replicates data blocks in the sequence of the I/O writes to the disk, which are picked up by the SIOS filter driver. This is the same for both synchronous and asynchronous replication modes: block-level changes are written in sequence on the target side. So the potential for broken page writes is minimized.

      As the IBM doc describes, Db2 may start a “crash recovery” when the DB restarts. SIOS ensures a state that is “crash consistent” for database applications like Db2, meaning services can be recovered, although certain data, such as data pages held only in memory, might be lost in a crash.

      The purpose of the solution is to provide the least amount of data loss by replicating in real time, so this is as low an RPO as one can get.

      Of course, periodic backups will still need to be done, so that as a last resort the entire database can be restored after a catastrophic failure (backups should be stored in a different location).