There are many blogs and documents available describing how to best migrate your SAP system to HANA. This isn’t one of those.
What this is, on the other hand, is a few observations, and some lessons learned, when migrating an ERP system to new hardware using the R3load, aka Export/Import, method of system copy. The overall process is well-described in the official System Copy Guide and in numerous documents available on SCN, so I won’t go into that detail here. What is not well-described, however, is how to go about choosing some of the parameters to be used during the export and import — specifically, the number of parallel processes. First, however, let’s address some background confusion prevalent among many customers.
- Export on Old Hardware
- Import on New Hardware
- Max Degree of Parallelism
- Minimal Logging During Import
- Adjusting Parallel Processes During Import
Homogeneous or Heterogeneous?
One point that seems to come up, time and time again, in questions posted to SCN is about whether a homogeneous system copy is allowed in the case of a database or operating system upgrade.
The answer is yes.
If you are upgrading your operating system, for instance from Windows Server 2003 to Windows Server 2012 R2, you are not changing your operating system platform. Therefore, this remains a homogeneous system copy (yes, you should be using system copy as part of a Windows operating system upgrade, as an in-place upgrade of the OS is not supported by either Microsoft nor SAP if any non-Microsoft application (i.e., your SAP system) is installed, except in special circumstances which generally do not include production systems).
If you are upgrading your database platform, for instance from SQL Server 2005 to SQL Server 2012, you are not changing your database platform, and so, again, this is a homogeneous system copy. It is possible and acceptable to upgrade SQL Server in place, although you might consider following the same advice given for a Windows OS upgrade: export your SAP system (or take a backup of the database), then do a clean, fresh install of the OS and/or DBMS and use SWPM to re-import your database while reinstalling SAP.
You are only conducting a heterogeneous system copy if you are changing your operating system, database platform, or both, i.e. from Unix to Windows or Oracle to SQL Server. Or migrating to HANA.
- Homogeneous: source and target platforms are the same (although perhaps on different releases).
- Heterogeneous: source and target platforms are different.
Export/Import or Backup/Restore?
The next question that often arises is whether an Export/Import-based migration or Backup/Restore-based copy is preferred. These methods sometimes go by different names:
Export/Import is sometimes called R3load/Migration Monitor based or Database Independent (in the System Copy Guide). Because this method is not reliant on database-specific tools, it is the only method that can be used for heterogeneous copies. However, it can also be used for homogeneous copies.
Backup/Restore is sometimes called Detach/Attach, or Database Dependent (in the Guide), or even just Homogeneous System Copy (in the SWPM tool itself). This method relies heavily on database-specific tools and methods, and therefore it can only be used for homogeneous copies.
If you are performing a heterogeneous system copy, then you have no choice. You must use the Export/Import method. If you are performing a homogeneous system copy, you may choose either method, but there are some definite criteria you should consider in making that choice.
Generally speaking, for a homogeneous system copy, your life will be simpler (and the whole procedure may go faster) if you choose the Backup/Restore method. For a SQL Server-based ABAP system, for instance, you can make an online backup of your source database without having to shut down the SAP system, which means there is no downtime of the source system involved. Copy the backup file to your target system, restore it to a new database there, then run SWPM to complete the copy/install. This is great when cloning a system for test purposes. Of course, if the goal is to migrate the existing system to new hardware, then downtime is inevitable, and you certainly don’t want changes made to the source system after the backup.
The Detach/Attach variant of this method is probably the fastest overall, as there is no export, import, backup, or restore to be performed. However, downtime is involved. You shut down the source SAP system, then use database tools (SQL Server Management Studio, for instance), to detach the database. Then you simply copy the database files to your target system, use database tools again to attach the database, then run SWPM on the target to complete the copy/install.
By comparison, the Export/Import method involves shutting down the source SAP system, then using SWPM to export the data to create an export image (which will likely be hundreds of files, but will also be considerably smaller than your original database), then using SWPM again on the target system to install SAP with the export image as a source. Lots of downtime on the source, and generally speaking a more complex process, but much less data to move across the network.
Obviously I am a big fan of using the Backup/Restore or Detach/Attach database-dependent method for homogeneous system copies, and in most cases, this is what I would advise you to choose.
When You Should Choose Export/Import
There is one glaring disadvantage to the Backup/Restore method, however. This method will make an exact copy of your database on your target system, warts and all. Most of the time, that isn’t really an issue, but there are circumstances where you might really wish to reformat the structure of your database to take advantage of options that may not have been available when you originally installed your SAP system, or perhaps to make up for poor choices at the time of original install that you would now like to correct. Well, this is your big opportunity.
What are some of these new options?
- Perhaps you are migrating to new hardware, with many more CPU cores than available on the old hardware, and you see this as a prime opportunity to expand your database across a larger number of files, redistributing the tables and indexes across these files, thus optimizing the I/O load. Backup/Restore will create a target database with the same number of files as the source, with the tables distributed exactly as they were before. You can add more files, but your tables will not be evenly redistributed across them. Export/Import, on the other hand, doesn’t care about your original file layout, and gives the opportunity to choose an entirely new file layout during the import phase.
- Perhaps you are upgrading your DBMS and would like to take advantage of new database compression options. Yes, you can run MSSCOMPRESS online after upgrading to a platform that supports it, but this can have long runtimes. SWPM will, however, automatically compress your database using the new defaults during the import, assuming your target DBMS supports these defaults, so you can achieve migration and compression in a single step. Compression does not add any extra time to the import.
Parallel Processing During Export and Import
At the beginning of the export and the import in the SWPM tool, there is a screen where you are asked to provide a Number of Parallel Jobs. The default number is 3. This parameter controls how many table packages can be simultaneously exported or imported, and obviously it can have a huge impact on overall runtime. The System Copy Guide does not give much in the way of advice about choosing an appropriate number, and other documentation is sparse on this topic. Searching around SCN will bring up some old discussion threads in which advice is given ranging from choosing 1 to 3 jobs per CPU, and so forth, but it is difficult to find any empirical data to back up this advice.
This is an area needing more experimentation, but I can share with you my own recent experience with this parameter.
Export on Old Hardware
I exported from two different QAS machines, both using essentially identical hardware: HP ProLiant DL385 Gen1 servers, each with two AMD Opteron 280 2.4 GHz Dual-Core CPUs (a total of 4 cores, no hyperthreading) and 5 GB of RAM, running Windows Server 2003 and SQL Server 2005. I think you can see why I wanted to get off these machines. The application is ERP 6.04 / NetWeaver 7.01 ABAP. The databases were spread across six drive volumes.
Export 1: 3 Parallel Processes on 4 Cores
The first export involved a 490 GB database, which SWPM split into 135 packages. I hadn’t yet figured out what I could get away with in terms of modifying the number of export jobs involved, so I left the parameter at the default of 3. The export took 8 hours 25 minutes. However, the export package at the end was only 50.4 GB in size.
Export 2: 6 Parallel Processes on 4 Cores
By the time I got around to the second export I had learned a thing or two about configuring these jobs. This time the source database was 520 GB, and SWPM split it into 141 packages. I configured the export to use 6 processes. During the export I noted that CPU utilization was consistently 90-93%, so this was probably the maximum the system would handle. This time the export took 6 hours 28 minutes, a two-hour reduction. As most of the time was spent exporting a single very large table in a single process, thus not benefiting at all from parallelization, I probably could have reduced this time considerably more using advanced splitting options. The resulting export package was 57.6 GB in size.
Import on New Hardware
The target machines were not identical to each other, but in both cases the target OS/DBMS was Windows Server 2012 R2 and SQL Server 2012. Both databases would be spread across eight drive volumes instead of the previous six.
Import 1: 3, then 12, then 18 Parallel Processes on 12 Cores
The target of my first export, and thus first import, was an HP ProLiant BL460c Gen8 with two Intel Xeon E5-2630 v2 2.6 GHz six-core CPUs with hyperthreading and 64 GB of RAM. Yeah, now we’re talking, baby! Twelve cores, twenty-four logical processors, in a device barely bigger than my laptop.
At the start of this import, I still didn’t really have a handle on how to configure the parallel jobs, so as with the matching export, I left it at the default of 3. After all the DEV system I had migrated earlier didn’t take that long — but the DEV system had a considerably smaller database.
Five hours into the import I realized only 60 of the 135 packages had completed, and some quick table napkin calculations indicated this job wasn’t going to be finished before Monday morning when users were expecting to have a system. I did some research and some digging and figured it would be safe to configure one import job per core. However, I really didn’t want to start all over from scratch and waste the five hours already spent, so with a little more experimentation I found a way to modify the number of running jobs while the import was in process, with immediate effect. More on this in a bit.
So first I bumped the number of parallel jobs from 3 to 12, and immediately I saw that the future was rosier. I monitored resource usage for a while to gauge the impact, and I saw CPU utilization bouncing between 35% to 45% and memory utilization pegged at 46%. Not bad, it looked like we still had plenty of headroom, so I again bumped up the processes, from 12 to 18. The overall import job took another impressive leap forward in speed, while CPU utilization only rose 2-3% more and memory utilization didn’t change. It’s entirely possible this machine could have easily handled many more processes, but I had seen an anecdotal recommendation that the parallel processes should be capped at 20 (I’m not sure why, but there is some indication that much beyond this number and the overall process may actually go slower — but again, that may only be true for older hardware), and in any case all but one import package finished within minutes after making this change.
The final package took an additional three hours to import by itself. This was PPOIX, by far the largest table in my database at 170 GB (I have since talked to Payroll Accounting about some housecleaning measures they can incorporate), and thus without using table splitting options this becomes the critical path, the limiting factor in runtime. Still, I had gained some invaluable experience in optimizing my imports.
My new database, which had been 490 GB before export, was now 125 GB after import.
Import 2: 12 Parallel Processes on 8 Cores
The target of my second export, and thus second import, was also an HP ProLiant BL460c, but an older Gen6 with two Intel Xeon 5550 2.67 GHz quad-core CPUs with hyperthreading and 48 GB of RAM. Maybe not quite as impressive as the other machine, but still nice with eight cores, sixteen logical processors.
Based upon my experience running 18 processes on 12 cores, a 1.5:1 ratio, I started this import with 12 processes. I noted CPU utilization at 60-75% and memory utilization at 49%. Still some decent headroom, but I left it alone and let it run with the 12 processes. Despite seemingly matched CPU frequencies, the Gen6 really is not quite as fast as the Gen8, core for core, due to a number of factors that are not really the focus of this blog, and to this I attributed the higher CPU utilization with fewer processes.
This time, 140 of my 141 packages were completed in 2 hours 4 minutes. Again, PPOIX consumed a single import process for 6-1/2 hours by itself, in parallel with the rest of the import, and thus the overall import time was 6 hours 32 minutes. Next time I do this in a test system, I really will investigate table splitting across multiple packages, which conceivably could get the import time down to not much more than two, perhaps two and a half hours, or perhaps even much less should I be willing to bump up the process:core ratio to 2:1 or even 3:1.
The source database, 520 GB before export, became 135 GB after import on the target. Yeah, I’m quite liking this compression business.
Max Degree of Parallelism
In addition to adjusting the number of parallel jobs, I temporarily set the SQL Server parameter Max Degree of Parallelism (also known as MAXDOP) to 4. Normally it is recommended to keep MAXDOP at 1, unless you have a very large system, but as explained in Note 1054852 (Recommendations for migrations using Microsoft SQL Server), the import can benefit during the phase where secondary indexes are built with a higher level of parallelism. Just remember to set this back to 1 again when the import is complete and before starting regular operation of the new system.
Minimal Logging During Import
The other important factor for SQL Server-based imports is to temporarily set trace flag 610. This enables the minimal logging extensions for bulk load and can help avoid situations where even in Simple recovery mode the transaction log may be filled. For more details see Note 1241751 (SQL Server minimal logging extensions). Again, remember to remove the trace flag after the import is complete.
Adjusting Parallel Processes During Import
During Import 1 I mentioned that I adjusted the number of processes used from 3 to 12 and then to 18 without interrupting the import. How did I do that? There is a configuration file that SWPM creates using the parameters you enter at the beginning called import_monitor_cmd.properties. The file can be found at C:\Program Files\sapinst_instdir\<software variant>\<release>\LM\COPY\MSS\SYSTEM\CENTRAL\AS-ABAP (your path may be slightly different depending upon options you chose, but it should be fairly obvious). Within the properties file you will find the parameter jobNum. Simply edit this number and save the file. The change takes effect immediately.
How many parallel processes to choose is not a cut-and-dried formula. Generally, it seems that a ratio of processes to cores between 1.5:1 and 3:1 should be safe, but this will depend on the speed and performance of your CPU cores and general system hardware. On the Gen1 processors, 1.5:1 pegged them to over 90% utilization. On the Gen8 processors, 1.5:1 didn’t even break 50%, while the Gen6 fell somewhere in between. The only way to know is to test and observe on representative hardware.
There is also a memory footprint for each parallel process, but with anything resembling modern hardware it is far more likely you will be constrained by the number of CPU cores and not the gigabytes of RAM. Still, a number I have seen mentioned is no more than 1 process per 1/2 GB of RAM.
I have seen a suggestion of a maximum of 20 processes, but the reasons for this suggestion are not clear to me, and I suspect this number could be higher with current hardware.
If you have one or more tables of significant size, it is worthwhile to use the package splitter tool (part of SWPM) to break them up into multiple packages so that they can benefit from parallelization.
Thanks for following along, and hopefully you will find the above useful. If you have your own experiences and observations to add, please do so in the comments.