By srikanth_sandha

Scenario:


With the growing availability of high-bandwidth connectivity to the major clouds - GCP (Interconnect), Azure (ExpressRoute), and AWS (Direct Connect) - the S/4HANA downtime-optimized Conversion/Migration option, which is officially not available in combination with "system move", can be executed between On-Prem and Cloud (which effectively is a system move) with little effort, provided you test your network first to confirm feasibility.


Caveat: SAP has to bless this approach during your planning sessions, or you do it at your own risk.


How To:




  • SUM is started from the source system. During option selection, you do not select "System Move"; instead, the target database will be the one you install in the cloud, reachable from On-Prem.

  • You install SAP on the target cloud and connect it to the target DB, but this SAP system stays down until the SUM post-processing steps complete on the source. You then start the target SAP system and finish the remaining steps, such as post transports, Embedded Fiori activation, etc., from the target SAP server.

  • As an alternative, you can build a source application server in the cloud that connects to the source DB and start SUM from there. This creates a long-distance memory pipe between On-Prem and Cloud, which might be a little risky; however, you get some downtime improvement with this approach compared with starting SUM from the source. Risk and downtime are trade-offs between the two options.


[Diagram: On-Prem to Cloud network path with an example ping latency of 11 ms]


In the picture above, the ping time or network latency of 11 ms (as an example) comes from a simple ping test with 1450-byte packets between On-Prem and Cloud. This value changes with block size and bandwidth usage; a quick way to reproduce the measurement is sketched below.
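A minimal sketch of that latency check, using Linux ping syntax with a 1450-byte payload (the hostname is a placeholder for your own cloud endpoint):

import subprocess

# 10 probes with a ~1450-byte payload; replace target-cloud-host
# with the actual endpoint reachable over your interconnect.
subprocess.run("ping -c 10 -s 1450 target-cloud-host", shell=True)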


 

Pros:




  • The R3load memory pipeline stays on the same host, which ensures data is communicated consistently between the respective memory blocks of the export and import processes.

  • Parallel SQL requests execute from On-Prem to Cloud (data inserts/updates, etc.). This is nothing new in fast-changing architectures, where databases run in the cloud and clients in various locations perform operations against them. This Conversion/Migration is the same scenario: SQL queries/ABAP requests are sent from the shadow (SHD) instance on the source to the HANA DB in the cloud.

  • Conversion commands run through the SHD instance's target kernel On-Prem, but the actual conversion happens in the cloud HANA DB.

  • If you currently execute the conversion On-Prem and use HSR to replicate (migrate) to the cloud afterwards, this downtime-optimized method can instead be used to build performance-testing or QP systems directly in the cloud, which saves the cost of additional non-prod On-Prem servers; only production needs HSR as an additional step.


  • This procedure is not limited to S/4HANA Conversion/Migration; it is equally valid for DB migrations from AnyDB to HANA/Cloud.




Cons:




  • It is probably best suited for systems in the 5 TB - 7 TB range, depending on achievable throughput and the downtime window; validate with your own analysis.


Pre-Test:


Below are a few steps that might be handy in deciding whether or not to use this method.


 

  • Bandwidth vs. ping time or RTT (On-Prem to Cloud) - with multiple hops in the path, you may see spiked RTT while bandwidth stays underutilized. Since the parallel SQL processes travel from On-Prem to Cloud over the network, increasing the number of parallel processes to fully utilize the bandwidth reduces the role ping time plays in the S/4 conversion.


SAP standard values:


Good value: roundtrip time <= 0.3 ms

Moderate value: 0.3 ms < roundtrip time <= 0.7 ms

Below-average value: roundtrip time > 0.7 ms


 

*** In one of our NIPING tests we got around 11 ms ping time, so the effective RTT may be slightly higher once DB processing times are added. Compared with the SAP standard values, these numbers are far higher, but the benefit of parallelism can in some cases outweigh the disadvantage of high latency. The lesson learned from the test steps defined below is to validate the trade-off between cost and downtime, and to see whether high-bandwidth cloud network tools can reduce costs; a rough sketch of the parallelism math follows.
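To make the trade-off concrete, here is a minimal sketch: for a request/response tool like NIPING, roughly one block completes per round trip, so per-stream throughput is capped near block_size / RTT, and the aggregate scales with the number of parallel streams until the link bandwidth or DB resources become the bottleneck. The block size below is an illustrative assumption, not a measured value.

# Per-stream ceiling ~ block_size / RTT; aggregate ~ N * block_size / RTT.
block_bytes = 16 * 1024   # assumed block moved per round trip
rtt_s = 0.011             # ~11 ms measured RTT

per_stream = block_bytes / rtt_s  # bytes per second for one stream
for n in (1, 10, 50, 100):
    print("{:>3} parallel -> ~{:.1f} MB/s ceiling".format(n, n * per_stream / 1e6))

With these assumptions, 100 parallel streams land near the 140-150 MBps range measured later, which illustrates why adding streams compensates for a high RTT.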


 

  • Calculate the Bandwidth Delay Product = Bandwidth * Latency (we can use the ping time here). Increase the TCP send and receive buffer parameters net.ipv4.tcp_wmem and net.ipv4.tcp_rmem accordingly on both source and target; refer to SAP Note 2382421 for more information. Because our source acts as a client sending a large amount of data, this must be done on the client side (source) as well.


You can use https://www.switch.ch/network/tools/tcp_throughput/ to calculate this if needed. Larger buffers reduce the time spent waiting for acknowledgments when the send/receive buffers fill up, and they ensure maximum bandwidth usage; a small calculation sketch follows.
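As a minimal sketch, assuming an illustrative 10 Gbit/s link and the ~11 ms RTT measured above, the calculation and resulting candidate buffer maximums look like this; validate actual values against SAP Note 2382421 before applying anything:

# BDP = bandwidth * RTT: the amount of data that must be "in flight"
# to keep the link full. The link speed here is an assumption.
bandwidth_bps = 10 * 10**9   # assumed 10 Gbit/s interconnect
rtt_s = 0.011                # ~11 ms measured RTT

bdp_bytes = int(bandwidth_bps / 8 * rtt_s)
print("BDP ~ {:,} bytes (~{:.1f} MB)".format(bdp_bytes, bdp_bytes / 1e6))

# Candidate min/default/max settings sized from the BDP; apply on both
# source (client) and target, per the SAP note referenced above.
print("net.ipv4.tcp_wmem = 4096 87380 {}".format(bdp_bytes))
print("net.ipv4.tcp_rmem = 4096 87380 {}".format(bdp_bytes))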


 

  • Perform a network stability test between On-Prem and Cloud for 24 hours, as per SAP Note 500235 - Network Diagnosis with NIPING - and check for any data loss; a sketch follows.
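A minimal sketch of such a 24-hour run, with illustrative flag values and a placeholder hostname (the NIPING server must already be running on the target):

import subprocess

# 86400 loops with -D 1000 (1 second pause between loops) gives roughly
# a 24-hour observation window; watch the output for packet loss.
subprocess.run("niping -c -H target-cloud-host -B 1450 -L 86400 -D 1000",
               shell=True)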


 

  • Perform network throughput testing as per the above note with various block sizes. The attached Python program creates parallel NIPING threads that talk to a NIPING server started on the target. I am not a Python expert, and the program gave me errors with different Python versions, so correct data types as needed; it served my purpose, since I only needed to collect stats from the Cloud network metrics.


Tested with a GCP dedicated network: 100 parallel NIPING throughput tests gave around 140-150 MBps (use Cloud network metrics to obtain this data), which approximates to 500-540 GB/hour. The thread count can be increased or decreased according to target HANA DB resources and bandwidth availability. This test should be done only after changing the Bandwidth Delay Product parameters. You can do similar testing with IPERF3, but the results were a little different for me. A rough transfer-window estimate based on these numbers follows.
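A minimal sketch translating the measured throughput into a rough data-transfer window for the system sizes mentioned under Cons (the sizes are illustrative):

throughput_mb_s = 145                         # midpoint of the measured 140-150 MBps
gb_per_hour = throughput_mb_s * 3600 / 1000   # ~522 GB/hour
for size_tb in (5, 7):
    hours = size_tb * 1000 / gb_per_hour
    print("{} TB -> ~{:.1f} hours of raw transfer".format(size_tb, hours))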



[Screenshot: sample testing data capture from the Cloud network metrics]


 

 
import subprocess

# Gather test parameters interactively.
TargetServer = input("Enter server hostname: ")
TotalParallel = int(input("Enter total parallel to be tested: "))
PacketSize = int(input("Enter buffer size to be tested: "))
DelaySet = input("Do you want to run with a delay Y or N: ")
DelayValue = 0
LoopSet = input("Do you want to run this as loop Y or N: ")
LoopValue = 3
if LoopSet == 'Y':
    LoopValue = int(input("Enter number of times to loop: "))
if DelaySet == 'Y':
    DelayValue = int(input("Enter delay value: "))

# NIPING client command; the NIPING server must already be running on the target.
ParallelCommand = "niping -c -H {} -B {} -L {} -D {}".format(TargetServer, PacketSize, LoopValue, DelayValue)
print(ParallelCommand)

# Start all clients in parallel, collecting their output in one file,
# and wait for every one of them to finish before parsing results.
Pprocesses = []
with open("FileOutPut.txt", "w") as f:
    for IteLoop in range(TotalParallel):
        Pprocesses.append(subprocess.Popen(ParallelCommand, stdout=f, shell=True))
    for p in Pprocesses:
        p.wait()

# Pull the av2 (average time) and tr2 (throughput) statistics lines
# out of the combined NIPING output.
with open("tr2file.txt", "w") as g:
    subprocess.Popen("cat FileOutPut.txt | grep tr2", stdout=g, shell=True).wait()
with open("avgfile.txt", "w") as h:
    subprocess.Popen("cat FileOutPut.txt | grep av2", stdout=h, shell=True).wait()

# Sum the second column of each statistics file and average it
# across all parallel processes.
commandAVGP = "cat avgfile.txt | awk '{sum+=$2} END {print sum}'"
AVG = float(subprocess.Popen(commandAVGP, stdout=subprocess.PIPE,
                             universal_newlines=True, shell=True).communicate()[0])
TotalAVG = AVG / TotalParallel
print("Total avg time for: {} Parallel process: {}".format(TotalParallel, TotalAVG))

commandTR2P = "cat tr2file.txt | awk '{sum+=$2} END {print sum}'"
TR2 = float(subprocess.Popen(commandTR2P, stdout=subprocess.PIPE,
                             universal_newlines=True, shell=True).communicate()[0])
TotalTR2 = TR2 / TotalParallel
print("Total avg Throughput for: {} Parallel process: {}".format(TotalParallel, TotalTR2))

 


Start NIPING in server mode on the target server, then run this program from the client.
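For reference, the server side is typically started as follows (see SAP Note 500235 for the full option list):

niping -s -I 0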

Python Program Input:


[Screenshot: the program's interactive prompts with the chosen test parameters]


Python Program Output:


An increase in the average times is what tells us the delay or RTT. However, if your throughput is not impacted by an increase in parallel processes, there is still scope to add more parallel processes from a network perspective. You should also check and, if necessary, increase DB resources for the short go-live window, to maximize the benefit of parallelism and reduce downtime.


Perform this method with SAP's blessing or at your own risk. The steps mentioned above are for reference only; do explore and validate them for your own case.
