Do not complain about an insufficient migration rate if you use a 1 Gbit network card!

Do your math homework and check your network performance prior to DMO.


Boris Rubarth

Product Management Software Logistics, SAP SE




P.S. Sorry, yes, the blog above is short and rude, but I had to draw your attention to the network side of housekeeping:

You have to check the network performance before starting with DMO: check the throughput between the source DB and the PAS (on which the SUM and R3load processes are running), and between the PAS and the target DB. One network tool that I am aware of is iPerf: “iPerf3 is a tool for active measurements of the maximum achievable bandwidth on IP networks.” (http://iPerf.fr). But even ftp can provide a first insight: transfer a large file and check the throughput. Measurements like this can detect a wrong configuration of network cards, a transfer limitation due to firewalls, or other hurdles.
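As an illustration of such a first quick test, here is a rough sketch (my own, for illustration only — it measures loopback throughput in one process; to probe a real link you would run the sink on one host and the sender on the other, or simply use iPerf):

```python
# Minimal throughput probe, in the spirit of "transfer a large file and
# check the throughput". Both ends run locally here for illustration.
import socket
import threading
import time

PAYLOAD_MB = 64          # amount of data to push through the connection
CHUNK = 1024 * 1024      # 1 MB send/receive chunk

def sink(server_sock):
    """Accept one connection and discard everything it sends."""
    conn, _ = server_sock.accept()
    with conn:
        while conn.recv(CHUNK):
            pass

def measure_throughput(host="127.0.0.1"):
    server = socket.socket()
    server.bind((host, 0))          # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    threading.Thread(target=sink, args=(server,), daemon=True).start()

    data = b"\0" * CHUNK
    start = time.monotonic()
    with socket.create_connection((host, port)) as client:
        for _ in range(PAYLOAD_MB):
            client.sendall(data)
    elapsed = time.monotonic() - start
    return PAYLOAD_MB / elapsed     # MB per second

if __name__ == "__main__":
    mbps = measure_throughput()
    print(f"~{mbps:.0f} MB/s (~{mbps * 3600 / 1000:.0f} GB/hour)")
```

Note that a single-stream test like this (or ftp) only gives a lower bound; tools like iPerf can open several parallel streams.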

And network card size matters! Using a 1 Gbit network card means a theoretical maximum throughput of 450 GB/hour, practically ~350 GB/hour. [The math is: a 1 Gbit card means 1000 Mbit per second, which is 125 MByte per second, which is 450 GB per hour.] DMO can do more, if you let it … see Optimizing DMO Performance. So you should rather use a 10 Gbit network card.
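The conversion above can be written as a one-liner if you want to plug in your own card speed (a small sketch; the function name is just illustrative):

```python
# Theoretical NIC throughput: line rate in Gbit/s -> GB per hour.
def gb_per_hour(gbit_per_s: float) -> float:
    bytes_per_s = gbit_per_s * 1e9 / 8      # 1 Gbit/s = 125 MB/s
    return bytes_per_s * 3600 / 1e9         # seconds per hour, bytes -> GB

print(gb_per_hour(1))    # 450.0 -> ~350 GB/hour in practice
print(gb_per_hour(10))   # 4500.0
```

In practice protocol overhead, latency, and the R3load processes themselves keep you well below the line rate, hence the ~350 GB/hour rule of thumb for 1 Gbit.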

Feel free to add your favorite network tool as comment for this blog.


7 Comments


  1. Jose Eduardo Santos Pereira

    Hello Boris,

    Thanks for the information. Maybe niping can be used to test the network, as described in SAP Note 500235 – Network Diagnosis with NIPING.

    Recently, I made some DMO benchmarks with 1 Gbit and 10 Gbit links. On the 10 Gbit network, with a migration from Oracle (database ~1 TB) to HANA, I had a throughput of 155 GB/hour with 288 R3load processes. I know that infrastructure and parameters can influence the throughput, so my question is: what is the best throughput that you know of? Is it possible to get a throughput of 800 GB/h or more?

    Best Regards,

    Jose Pereira

    1. Boris Rubarth Post author

      Hello Jose,

      Thank you for your contribution and your hint.

      Concerning your migration rate: using a 10 Gbit network and 288 R3load processes, I would definitely expect a higher value than only 155 GB/hour.

      We have customer reports with up to 1000 GB/hour, and customers using 700 R3load processes and more.

      Best regards, Boris

      1. Jose Eduardo Santos Pereira

        Hello Boris,

        I think I was conservative with the number of R3load processes. In my case I had 72 CPUs (ST06) and calculated 4 R3load processes per CPU, because the SAP Central Instance, the Oracle database, and SUM/DMO were on the same host.

        I read SAP Note 1616401 – Parallelism in the Upgrades, EhPs and Support Packages implementations: “…Since R3load utilizes small CPU processing, you can safely enter a significant number for this parameter: 3 to 5 times the number of CPUs, being “5” the theorical top limit.” and used 4 R3load processes per CPU.

        Next time I will increase the number of R3load processes per CPU.
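The sizing rule quoted from SAP Note 1616401 can be sketched as a quick calculation (the function name is mine, for illustration; the 3-to-5 factor is from the note as quoted above):

```python
# R3load sizing rule of thumb from SAP Note 1616401:
# 3 to 5 R3load processes per CPU, with 5 as the stated upper limit.
def r3load_processes(cpus: int, per_cpu: int = 4) -> int:
    if not 3 <= per_cpu <= 5:
        raise ValueError("SAP Note 1616401 suggests 3 to 5 per CPU")
    return cpus * per_cpu

print(r3load_processes(72))      # 288, the number used in the comment above
print(r3load_processes(72, 5))   # 360, the upper end of the rule
```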

        Best Regards,

        Jose Pereira

        1. Boris Rubarth Post author

          Hello Jose,

          I see … the limit is the performance of the AS, and whether a further network increase is still possible. So both will have to be monitored.

          BR, Boris

  2. Ashutosh Chaturvedi

    Hi Boris,

    We have a 3 TB database, and when doing the migration/upgrade with SUM DMO it takes 45 hours of downtime. We are checking the network throughput in our dry run 2.

    I have analyzed the logs, and SUM DMO is using an “equidistant algorithm to split the table”. We have used the duration file to minimize the downtime, and it has come down to 30 hours, but that is still not acceptable.

    Is there something we can do to bring the downtime down to 10 hours by splitting the tables further and feeding that information to SUM DMO? I think it is possible, but for the last two days I have been reading the logs of the previous run and found that WHR files are created for splitting purposes.

    Is there any file we can edit to specify which tables above 50 GB need to be split further?

    Can we change the splitting algorithm used by SUM DMO, or should I prepare a splittable.txt file and feed it to SUM DMO?

    If either of these is possible, can you please suggest how to achieve it?

    It will be of great help.

    With Regards
    Ashutosh Chaturvedi

  3. Maxim Afonin

    Well, since the network topic is now under discussion, you should also always keep in mind that the RTT is a limiting factor for your maximum throughput:
    RCV buffer size / RTT = max TCP throughput (any calculator from the web can be used for this: http://wintelguy.com/wanperf.pl)

    So the line-rate math in the blog is not the 100% true story: with 3 ms of latency, it will be around 945 Mbit/s…

    FTP will not allow you to simulate several streams. So if you want the real picture, you should use iperf (as mentioned initially).
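The buffer/RTT formula in this comment can be checked with a quick calculation (a sketch; the helper name and the 64 KiB example window are mine, for illustration):

```python
# Max TCP throughput for a single stream is capped by the receive
# window and the round-trip time:  throughput = RCV buffer / RTT.
def max_tcp_mbit(rcv_buffer_bytes: int, rtt_ms: float) -> float:
    """Upper bound on single-stream TCP throughput, in Mbit/s."""
    return rcv_buffer_bytes * 8 / (rtt_ms / 1000) / 1e6

# A classic 64 KiB window over a 3 ms round trip:
print(max_tcp_mbit(64 * 1024, 3.0))   # ~175 Mbit/s, far below 1 Gbit line rate

# Bandwidth-delay product: window needed to saturate 10 Gbit/s at 3 ms RTT
print(10e9 / 8 * 0.003)               # 3750000.0 bytes, i.e. ~3.75 MB
```

With TCP window scaling enabled, modern stacks grow the window well beyond 64 KiB, which is how figures close to the 1 Gbit line rate become reachable at 3 ms RTT.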

  4. Ambarish Satarkar

    Hi Boris,

    Thank you for the nice blog and explanation. We recently did a DMO run from ECC EHP6 on a SQL database to ECC EHP8 on HANA. The source database size is approx. 12 TB, with 7 TB of application table data. We are using the latest SUM tool and kernel executables.

    We are concerned about the row count(*) time, which was 29.5 hours: 99% of the row count finished in 20 minutes, and the last 1% (2 buckets with a set of 500 tables each) took 29 hours.

    As the row count is still part of the downtime, we need to understand why it took such a long time.

    There were no errors during the row count, and no bucket failures were observed.

    We completed the application table data migration in 34 hours, at approx. 200 GB/hour.

    We are using a 10 Gbit network card, and no network issues were observed.

    In our previous experience, the DMO row count for a similar-sized Oracle DB finished in less than an hour.

    What is your view, and what do you suggest to avoid this long runtime of the row count(*) operation?

    Thanks,
    Ambarish

