Hibernating SAP systems to keep them from consuming too many resources under Sun Solaris 10
Sun Solaris 10 provides powerful dynamic resource allocation, which can be used very efficiently for your running SAP systems. We will use the zone configuration shown in the first weblog, Moving installed SAP systems in Sun Solaris 10 zones between hosts, and continue with the additional configuration the zones require. We will also give you a business scenario for this kind of resource control.
Changing project configuration inside the zone
In order to change the project configuration, we need to edit the respective configuration file inside the zone. It is therefore necessary to connect to the zone and edit the file /etc/project, which is the same file we used in the initial configuration. A new entry for the project G62 is inserted into both zones using the projmod command:
g62as1# projmod -s -K rcap.max-rss=10GB G62
The command option -K rcap.max-rss=10GB limits the available memory for the project G62 to approx. 10 GByte. For the changes to take effect it is necessary to start the Resource Capping Daemon:
g62as1# rcapadm -E
This daemon and its utilities provide mechanisms for capping physical memory, enforcing the caps and administering them at runtime. It works asynchronously, i.e. changes take effect only after an interval, the maximum length of which can be set by the administrator.
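The interval mentioned above corresponds to the config interval of rcapadm, which controls how often the daemon rereads the project caps. The following is a minimal sketch, not the exact procedure used for this blog: the run helper, the enable_cap function and the value of 30 seconds are assumptions for illustration. With DRY_RUN left at 1 the commands are only printed.

```shell
# Sketch: cap a project and shorten the interval at which rcapd
# picks up configuration changes. Needs a POSIX shell (ksh, bash).
# DRY_RUN=1 (default) prints the commands; DRY_RUN=0 executes them as root.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# enable_cap PROJECT SIZE CONFIG_SECONDS (hypothetical helper)
enable_cap() {
    run projmod -s -K "rcap.max-rss=$2" "$1"
    run rcapadm -i "config=$3"    # reconfiguration interval in seconds
    run rcapadm -E                # enable the resource capping daemon
}

enable_cap G62 10GB 30
```

A shorter config interval makes cap changes visible sooner, at the price of the daemon rescanning the project database more often.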
Once the daemon has been enabled, the memory utilization for the project can be viewed using the command rcapstat. As an example, the following invocation of rcapstat generates 10 reports at an interval of 5 seconds:
g62as1# rcapstat 5 10
The output can be interpreted as follows:
- The project G62 runs 32 processes (nproc).
- The project currently uses 3639 MByte of RAM; this is called the resident set size (rss).
- G62 has a total amount of virtual memory of 71 GByte. This includes all mapped files and devices.
- The project has a memory cap at 10GB (cap).
- No memory was paged out by the rcap daemon since the last interval.
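The comparison of rss and cap described in this list can also be scripted. The sketch below is only an illustration: the over_cap function is hypothetical, the column order (id, project, nproc, vm, rss, cap, ...) is assumed from the rcapstat report layout, and the sample line is made up from the values discussed above, including an invented project id of 100.

```shell
# Sketch: flag projects whose resident set size exceeds their cap.
# Reads rcapstat-style lines on stdin.
# On Solaris 10 set AWK=nawk, since the default awk lacks user functions.
over_cap() {
    "${AWK:-awk}" '
    function kb(v,    n, u) {            # convert e.g. 3639M or 10G to KByte
        n = v + 0                        # leading number
        u = substr(v, length(v), 1)      # unit suffix
        if (u == "G") return n * 1024 * 1024
        if (u == "M") return n * 1024
        return n
    }
    $1 == "id" { next }                  # skip header lines
    NF >= 6   { print $2, $5, $6, (kb($5) > kb($6) ? "OVER" : "ok") }
    '
}

# Hypothetical sample line: project G62, rss 3639M, cap 10G.
echo "100 G62 32 71G 3639M 10G 0K 0K 0K 0K" | over_cap   # prints: G62 3639M 10G ok
```

On a live system the function could be fed directly, for example with rcapstat 5 10 | over_cap.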
Now that we have seen how to switch on memory caps and how to monitor them, the next section details the actual effect of using these features. Note that if you encounter problems with the rcap daemon, it can be restarted by disabling and re-enabling it:
rcapadm -D
rcapadm -E
If the problems remain, the whole service can be restarted using the commands:
svcadm disable rcap
svcadm enable rcap
This restarts the service and the daemon is available again.
Using the RCAP daemon for trimming memory
By adding a new entry to the project configuration and starting the daemon we enabled the capping. Now we want to reduce the available memory at runtime, so we edit the project file again and change the memory cap to 2GB. Before the memory is trimmed, the SAP system is available and we can work with it very smoothly. rcapstat reports the following:
g62as1# rcapstat 5 20
Now we reduce the memory by issuing the following command:
g62as1# projmod -s -K rcap.max-rss=2GB G62
The output of another invocation of rcapstat right after lowering the memory cap looks as follows:
g62as1# rcapstat 5 20
The first line shows the original memory cap of 10 GByte and the second line shows our new cap of only 2 GByte. Once the new cap is in effect, the daemon pages out, under the hood, all memory beyond the size indicated by the cap. This behaviour can be monitored using the iostat command:
g62as1# iostat -xcM 5
After the new cap has taken effect, we try to access the SAP system again and observe that it is still available and running, but very slow. Calling transactions takes some time. In other words, the SAP system is hibernated.
A logical question is whether the changes to the cap can be reversed just as easily. Indeed, all that has to be changed is the project setting for the memory cap, from 2 GByte back to the initial 10 GByte. Monitoring this change using the rcapstat command shows that, after a short period of time, the rcap daemon increases the amount of memory available to project G62.
To validate the change, we try to access the SAP system again. It is still available and runs smoothly again. As the iostat command confirms, the paging has stopped and the SAP system has full access to its initial resources again. The next step shows how to create a processor pool and assign it to a zone.
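Switching between the two caps can be wrapped in a pair of small helpers. This is a minimal sketch under stated assumptions: the function names hibernate, wake and set_cap are hypothetical, and the two cap values are simply the 2GB and 10GB used in this section. With DRY_RUN left at 1 the projmod calls are only printed.

```shell
# Sketch: toggle a project between its normal and its hibernation cap.
# DRY_RUN=1 (default) prints the command; DRY_RUN=0 executes it as root.
DRY_RUN=${DRY_RUN:-1}
FULL_CAP=10GB     # cap for normal operation
SLEEP_CAP=2GB     # cap used to hibernate the system

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

set_cap()   { run projmod -s -K "rcap.max-rss=$2" "$1"; }
hibernate() { set_cap "$1" "$SLEEP_CAP"; }
wake()      { set_cap "$1" "$FULL_CAP"; }

hibernate G62    # prints: projmod -s -K rcap.max-rss=2GB G62
wake G62         # prints: projmod -s -K rcap.max-rss=10GB G62
```

Because the daemon works asynchronously, the effect of either call shows up in rcapstat only after its next reconfiguration interval.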
Creating a pool and assigning it to a zone
If we want to dynamically reduce the available CPUs, we first have to create a processor set inside a pool. This pool is then assigned to one of the zones on our big server, which hosts all zones. To create the pool and assign it to the zone we use the following commands:
host1# poolcfg -c 'create pset pset_G62AS1 (uint pset.min = 1; uint pset.max = 3)'
host1# poolcfg -c 'create pool pool_G62AS1 (string pool.scheduler="FSS")'
host1# poolcfg -c 'associate pool pool_G62AS1 (pset pset_G62AS1)'
host1# pooladm -c
host1# zonecfg -z G62AS1 set pool=pool_G62AS1
host1# zoneadm -z G62AS1 reboot
These commands have the following effects. We first create the processor set (pset) and set its lower bound to one and its upper bound to three processors. Note that after starting the zone the processor set initially contains three processors. Then we create a pool that uses the Fair Share Scheduler and associate the processor set with this pool. We commit our changes and assign the pool to our application server zone, G62AS1. Unfortunately the zone needs to be rebooted after the pool is assigned to it. The next figure shows the principle of resource binding for SAP systems: either you assign the entire zone to a pool or you assign only the project to a pool.
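The same sequence can be generated for any zone. The sketch below only prints the commands so they can be reviewed before being run in the global zone; the pool_plan function is hypothetical. The first two lines it prints (pooladm -e to enable the pools facility and pooladm -s to save the current configuration) are an assumption about the prerequisites, since the sequence above presumes that pools are already enabled and a configuration file exists.

```shell
# Sketch: print the commands that create a processor set and a pool for a
# zone and bind the zone to the pool.
# pool_plan ZONE MIN_CPUS MAX_CPUS
pool_plan() {
    zone=$1 min=$2 max=$3
    cat <<EOF
pooladm -e
pooladm -s
poolcfg -c 'create pset pset_$zone (uint pset.min = $min; uint pset.max = $max)'
poolcfg -c 'create pool pool_$zone (string pool.scheduler="FSS")'
poolcfg -c 'associate pool pool_$zone (pset pset_$zone)'
pooladm -c
zonecfg -z $zone set pool=pool_$zone
zoneadm -z $zone reboot
EOF
}

pool_plan G62AS1 1 3
```

Printing instead of executing keeps the reboot of the zone a deliberate, separate step.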
Increasing/decreasing the number of available CPUs
When using the zone configuration as shown, i.e. without a pool, you can reduce the CPU resources available to the zone at runtime. It is not necessary to reconfigure or reboot your zone. You can simply reduce them by issuing the following command:
prctl -n zone.cpu-shares -r -v $SHARES `pgrep -z $ZONENAME init`
$SHARES is the number of CPU shares the zone should receive and $ZONENAME is the name of your local zone. By changing the $SHARES parameter you can increase and decrease the share of CPU capacity available to the zone. The command needs to be issued from the global zone. It first looks up the init process of the zone, then applies the new resource control value via this process. In this case the resource control is the zone's number of CPU shares.
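With the Fair Share Scheduler, the value of zone.cpu-shares acts as a relative weight: when all zones are busy, a zone is entitled to its shares divided by the sum of the shares of all competing zones. The following sketch illustrates the arithmetic; the share_percent function and the share values are hypothetical examples, not settings taken from our systems.

```shell
# Sketch: CPU entitlement of a zone under the Fair Share Scheduler
# when every zone competes for CPU time.
# share_percent ZONE_SHARES TOTAL_SHARES  ->  integer percentage
share_percent() {
    echo $(( $1 * 100 / $2 ))
}

# Four zones with 10 shares each: every zone is entitled to a quarter.
share_percent 10 40    # prints 25

# One zone raised to 30 shares, the other three stay at 10 (total 60).
share_percent 30 60    # prints 50
```

The current value for a zone can be inspected from the global zone with prctl -n zone.cpu-shares -i zone $ZONENAME.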
When using pools for your zone, you can move CPUs at runtime from one processor set (pset), e.g. the default pset, to another, e.g. the pset of your specific zone, in order to increase that zone's computing power. The necessary command will be similar to the following one:
poolcfg -dc "transfer 2 from pset pset_default to pset_G62AS1"
The command transfers 2 processors from the processor set pset_default to processor set pset_G62AS1. This command can also be used to transfer CPUs back to the default pset.
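Both directions of the transfer can be produced by one small helper. This is a sketch for illustration only; the move_cpus function is hypothetical and merely prints the poolcfg call so it can be checked before it is run in the global zone.

```shell
# Sketch: print the poolcfg call that moves CPUs between processor sets.
# move_cpus COUNT FROM_PSET TO_PSET
move_cpus() {
    echo "poolcfg -dc 'transfer $1 from pset $2 to $3'"
}

move_cpus 2 pset_default pset_G62AS1   # grow the zone's processor set
move_cpus 2 pset_G62AS1 pset_default   # hand the CPUs back
```

When choosing the count, keep the pset.min and pset.max bounds from the pool definition in mind.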
So, how can we validate the work of Solaris? There is a simple answer. To check whether the dynamic reduction of the available CPUs actually works, we use a simple test: the SAP transaction SGEN, which compiles ABAP source code into object code. Because it takes quite a lot of time and resources to compile all ABAP programs of an SAP system, it is a good way to evaluate to what extent dynamic reallocation actually affects an instance.
We used the Business Warehouse part of the ABAP stack as the test setting for the compilation run. All affected data is read from the database and compiled, and the results, i.e. the object code, are stored back into the database. In the first run, 16 CPUs were available for all zones and we recorded the time for compiling about 3% of the objects (170 of 6693 objects). SAP forecast an overall time of approx. 3 hours and 5 minutes. Then we interrupted the running job and started the generation run again, but this time with only 2 CPUs. Again we recorded the time for compiling about 3% of the objects. SAP forecast the whole run to finish within 4 hours and 5 minutes. Running with only 2 CPUs thus takes about an hour longer than running with 16 CPUs.
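The two forecasts can be put side by side with a few lines of shell arithmetic. The helper names to_minutes and slowdown_percent are made up for this sketch; the input values are the forecasts reported above.

```shell
# Sketch: compare the SGEN forecasts for 16 CPUs and 2 CPUs.
to_minutes() {            # to_minutes HOURS MINUTES
    echo $(( $1 * 60 + $2 ))
}
slowdown_percent() {      # slowdown_percent BASE_MIN OTHER_MIN (integer percent)
    echo $(( ($2 - $1) * 100 / $1 ))
}

t16=$(to_minutes 3 5)     # 185 minutes with 16 CPUs
t2=$(to_minutes 4 5)      # 245 minutes with 2 CPUs

echo "extra minutes: $(( t2 - t16 ))"                    # prints: extra minutes: 60
echo "slowdown: $(slowdown_percent "$t16" "$t2") percent"  # prints: slowdown: 32 percent
```

In other words, one eighth of the CPUs lengthened the forecast by roughly a third.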
Another aspect to investigate is the dynamic allocation and deallocation of CPUs and its impact on the run time. We used another test setting to find out more about this aspect. The same generation run was started again using 1 CPU, and SAP reported a forecast overall time of 6 hours and 3 minutes for the remaining objects after, again, about 3% (170 of 6693 objects) had been compiled. As can be seen in the following overview, the only assigned CPU (id=2) is fully used by the SAP system.
After recording the time we increased the number of CPUs to 3 by transferring 2 CPUs from pset_default to pset_G62AS1. After SAP had updated its time forecast and we had refreshed the dialog, the remaining time was estimated at 3 hours and 48 minutes. 330 objects had already been compiled at this time. Dynamically changing the number of CPUs evidently influences the performance of the SAP system immediately. The mpstat command, running inside the local zone, reported that two CPUs had been added, as can be seen in the following output excerpt.
<<processors added: 0, 1>>
After a short time (less than 5 seconds) the SAP system utilized all CPUs for the generation run.
But how can you use this in your computing center? The next section shows our business scenario.
The features described in the previous sections will now be presented in a case study that we implemented in our SAP hosting environment. The setting is as follows. We are dealing with four large SAP training systems on one machine, named G62 through G65. We know that each of these SAP systems is used extensively only once per day and is unused for the rest of the day. We configured the systems as follows:
- G62 needs about 12 GByte RAM for running smoothly.
- G63 needs 16 GByte RAM at minimum.
- G64 is a small system. We can install it with only 4 GByte RAM.
- G65 is the largest system and needs about 20 GByte RAM while it is running.
If we installed these SAP systems on one host without capping, we would have to use a host with about 52 GByte RAM. The host we are using, however, has only 32 GByte. Nevertheless we intend to install all four SAP systems on this host. This can be achieved by installing one SAP system first and then reducing its maximum available memory. After reducing the memory we can go on and install further SAP systems. All other SAP systems stay online, but in a reduced operational mode.
When all SAP systems are installed and the memory cap for each system is set to 4 GByte, we are left with a 16 GByte buffer. It is used when the physical memory cap of an SAP system is removed and the system is allowed to use as many resources as it actually needs.
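The memory budget of this scenario can be checked with a short calculation. The sketch below uses the figures given above (32 GByte host, 4 GByte cap, the four system sizes); the function names sum_demand and buffer are hypothetical.

```shell
# Sketch: memory budget for four capped SAP systems on one host.
sum_demand() {            # sum the GByte values of NAME:SIZE pairs
    total=0
    for s in "$@"; do
        total=$(( total + ${s#*:} ))
    done
    echo "$total"
}
buffer() {                # buffer HOST_GB CAP_GB SYSTEM_COUNT
    echo $(( $1 - $2 * $3 ))
}

sum_demand G62:12 G63:16 G64:4 G65:20   # prints 52 (demand without caps)
buffer 32 4 4                           # prints 16 (free when all are capped)
```

The buffer of 16 GByte is exactly what the largest system, G65, needs on top of its 4 GByte cap to reach 20 GByte, so any one system can run uncapped while the other three stay capped.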
As mentioned above, in our scenario we know that each SAP system will be used only once a day extensively and we know when this will be. As an example schedule we assume that from 4 am to 8 am system G65 is used by a lot of users and so we plan to allow the system to use all its desired resources by removing its caps. The SAP system will therefore run with 20 GByte RAM during this time. At 8 am we enforce the cap again and reduce the amount of physical memory for G65. All users will have enough time to end their sessions on the system, because the system is still up and online.
Then the next system, G62, will be used by a lot of users. We have to remove the cap for it now and bring this system back to full performance. This way we can use the removal and enforcement of the cap for the SAP systems to manage all SAP systems on one machine, although the overall RAM size would initially not suffice for the four systems.
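Such a schedule can be expressed as a function that returns the cap a system should have at a given hour. This is a rough sketch under several assumptions: the cap_for function is hypothetical, only the G65 window (4 am to 8 am) is taken from the text, the G62 window is a placeholder, and instead of removing the cap entirely the sketch raises it to the size the system needs.

```shell
# Sketch: memory cap of a system at a given hour (0-23).
# cap_for SYSTEM HOUR
cap_for() {
    case "$1" in
        G65) start=4; end=8;  full=20GB ;;
        G62) start=8; end=12; full=12GB ;;   # placeholder window
        *)   echo 4GB; return ;;
    esac
    if [ "$2" -ge "$start" ] && [ "$2" -lt "$end" ]; then
        echo "$full"      # inside the busy window: full memory
    else
        echo 4GB          # otherwise: hibernation cap
    fi
}

cap_for G65 5    # prints 20GB
cap_for G65 8    # prints 4GB
cap_for G62 8    # prints 12GB
```

The result could then be passed to projmod -s -K rcap.max-rss=... inside the respective zone, for example from an hourly cron job.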
The following illustration shows the usage of RAM by the SAP systems inside the zones:
So how do we reduce the RAM now? For this we use the rcap daemon. Note that there is always an interval during which the rcap daemon has to page out the excess memory. During the switch from the 4 am configuration to the 8 am configuration the daemon has to page out approximately 16 GByte to the swap device, since G65 is capped to 4 GByte from its then current 20 GByte. This goes along with a high workload for the daemon process. The high paging load always occurs when enforcing a cap.
In this blog we showed how to use the resource capping daemon and the processor pool features to administer your large environment more efficiently.
This blog entry was made possible by the “Sun Early Experience Lab at Technische Universitaet Muenchen” and the German “SAP University Alliance Program”.