Configuration Screen of SUM : Parameters for procedure.
Ok, So the Problem was on the Process count to use for our updates and upgrades. I turned to many fellows in my touch, All of them had some calculations of random andarbitrary nature. I was not convinced. We had to have something concrete for each and every process type which could be used for direct calculations. Tried multipleruns on a Sandbox, Referred to many notes, Initiated an OSS Incident, and also turned to inhouse experts of Database and SAP.
The final results provided reason for the random arbitrary nature of the view taken by my colleagues. You can’t have something conclusive like (Number of CPUs X 1.3 = R3trans processes to use), although a lot of industry veterans do so. What one can do is fall into the ‘Thought process’ of researching, tuning, observing, andtesting.
One of things that I found myself in great need of but missing was a good SCN blog on the topic. There were tidbits here and there, but hardly any good guidance.
The reason I initiate this blog and discussion is just that : To get thoughts from any and all, so the end page is an ever evolving starting point for everyone at the above screen of SUM for their respective SP/EHP/Release Upgrade.
Lets discuss process by process the thought process I used:
1. ABAP Processes :
Pretty Straightforward. Configure according to BGD processes available in the main system. Make sure enough is left for the Normal jobs users have to run. For downtime, you can use the maximum available. As per the SUM Guide, the returns stagnate after a value of 8. So below is what I used for system with 10 BGD available:
UPTIME : 6
DOWNTIME : 10
Could have increased the BGD in the system, but since the value above 8 should not have had much impact, above counts seemed optimal to me.
2. SQL Processes :
This part looks simple, but was the trickiest for me. Appropriately sizing this can do good for DBCLONE and PARCONV_UPG Phases. But size too large and you may experience frequent deadlocks in various phases, logging space full errors, archiver stuck, or performance severely impacted.
The Problem in my case, when using nZDM with very high SQL count was – “Transaction Log is full” – DB2 Database running out of the logging space. If you are working with a database like DB2 – where you have “Active logging space” constrained by DB parameters, make sure to size this process count small – Too many parallel SQL statements and logging space will fill up quick resulting in the aforementioned error which can only be bypassed by decreasing the count. To unthrottle, increase logging space or Primary/Secondary logs. Also, the log archiving has to be fast with plenty of buffer space in the archive directory.
As for the count, if one can take care of logging space and log archives, the next step is CPU. Different databases may slightly differ when dealing with execution of SQL in parallel. But core concept remains the same. More CPUs Help. Once you have a number, like 8 cores in my example, You next need to finalize the degree of parallelism (DOP – Oracle Term) – The number of parallel threads each CPU will be executing. For example, if 16 SQL Processes would have been used in my case – 2 threads would be executing per CPU – A choice I didn’t took as I wanted minimal impact on the productive operation of the system during the uptime phases.
Referring to the standard documentation of Oracle and DB2 databases – what I noticed was that the default and recommended DOP is 1-2 times the number of online CPUs. Also, the return is stagnated after increase to a particular number, after which the negative effects (Performance deterioration) increase as usual but returns are minimal.
After increasing the logging space, taking enough archiving space directory, following is the number I used for 8 CPUs.
UPTIME : 8 (DOP=1)
DOWNTIME : 12 (DOP = 1.5) Will make this 16 in the next system.
DBCLONE done in couple hours with above – Good for me.
4. R3trans Processes :
So the big one now. This process count has the biggest impact. TABIM_UPG, SHADOW_IMPORTS, DDIC_UPG – Phases with biggest contribution to runtime/downtime – go faster or slower based on how much this is tuned. The below KBA is the first step to understand how tp forks these during imports. There is a parameter “Mainimp_Proc” which is used in the backend to control the number of packages imported in parallel and the below KBA explains just that – The entire concept.
1616401 – Understanding parallelism during the Upgrades, EhPs and Support Packages implementations
1945399 – performance analysis for SHADOW_IMPORT_INC and TABIM_UPG phase
Now, how to tune it. This was one the most confusing ones. There are notes which say to keep it equal to number of CPUs (Refer Above notes – They say this). The SUM Guide seems to love the value of 8 (The Value larger than 8 does not usually decrease the runtime <sic>). You also have to keep in note the memory. A 512 MB of RAM per R3trans Process seems a good guideline. The end result for me was the same process count as SQL Processes :
UPTIME : 8
DOWNTIME : 12
One other thing still left unexplored, but next on my radar, is playing with “Mainimp_Proc”. The below link talks about changing that using parameter file
TABIMUPG.TPP. Since this controls the number of TP Processes, tuning this should be done after results from one system. Readings there in the logs can help here.
5. R3Load processes :
For EHP Update/SPS Update, I dont think this plays any part. From what I understood, this is relevant majorly to the Release Upgrade. Anyways, this one was a bummer. I didn’t seem to find any helpful documentation on R3load relevant for the upgrades specifically . However, Communicating with SAP over an OSS. The below guideline was received and used :
“There is no direct way to determine the optimal number ofprocesses. A rule of thumb though is to use 3 times the number of available CPUs.” The Count I used:
UPTIME : 12
DOWNTIME : 24
But anyone from the Community can answer and increase my understanding : Which phases use this in upgrades, if any?
6. Parallel Phases :
Another one of random nature with Scarce details. This one talks about the number of SUM sub-phases which SAPUp canbe allowed to execute in Parallel. Again, had to refer to SAP via OSS Incident for the same.
“The phases that can run in parallel will be dependent on upgrade/update that you will be performing and there is no set way tocalculate what the optimum number would be.” Recommendation was to use default and that is what I did.
UPTIME : Leave default (Default for “Standard” mode – 3, Default for “Advanced” mode – 6)
DOWNTIME : Leave default (Default for “Standard” mode – 3, Default for “Advanced” mode – 6)