SAP, Linux, Virtualization - Itanium ... continued

Former Member · ‎02-20-2010

I started this topic in the post SAP, Linux, Virtualization and - Itanium ..., and I now have some new findings and results. I have also broadened the categories and the audience, as this might be interesting to people working with SAP and virtualization, or moving to Linux (from either Windows or Unix).

I decided to run some approximate tests and quickly improvised benchmarks on the virtualized and bare metal SAP systems I have. My goal was to compare them with similar bare metal systems and different platforms, not to produce exact results comparable with official tests. I am interested in investigating further, and I welcome any suggestions and recommendations, both about benchmarking and about this environment in general. It is sometimes very difficult to draw meaningful interpretations from specific, formal benchmark results. General look and feel is not an exact measure and maybe not the only important thing, but it certainly matters. Here is the data ...

h2. Some benchmarks

Tests were made on SAP systems that are homogeneous copies of the production system (ERP / ECC6.0 EhP3 with a 1.1T Oracle db, 10.2.0.4), with 113499 objects chosen for SGEN. Three systems are compared, with the system IDs given here:

ERP: the production system on bare metal, used as the reference (its db node has no SAP instance on it, it is just a db node in a cluster)
ERM: central system, installed on an HVM guest on a host with new 4p 1.6GHz Montvale CPUs (2 cores, 2 threads each), 4 VCPUs, 6G RAM, Windows Server 2003 SP2 EE, latest updates
ERC: central system, installed on a PVM guest on the same host with new 4p 1.6GHz Montvale CPUs, 4 VCPUs, 6G RAM, RHEL5.4 with latest patches

h3. General observations

SGEN load running time on ERC (about 8-9 hours) was similar to or even slightly shorter than on ERP, while on ERM it ran more than twice as long; database statistics behaved similarly. A long-running job for closing the financial period, which takes about 2.5 hours on ERP, runs for about 4 hours on ERM. The wildest difference showed up with transaction FS10N (I measured the minimum time over several attempts for three stages with the same parameters: initial summary generation, first dialogue for all items, and the final report):

|_. System |_. 1st stage |_. 2nd stage |_. 3rd stage |
| ERP | 1 second | 2 seconds | 21 seconds |
| ERC | 1 second | 7 seconds | 1 minute |
| ERM | 10 seconds | 45 seconds | 30 minutes |

I know that what I offer here is probably not the best way to support claims about performance, but it is what I could do in the shortest time, trying to gain a general overall impression of these systems. I would like to hear about other people's experience or opinions, or even better, some practical suggestions about these tests and systems.

h3. A few words on disk I/O

For a start, here is one of the most important indicators for both database performance and virtualization: guest disk I/O. I have intentionally left out network I/O as it is less critical: I got quite similar performance when comparing Copy/Paste on bare metal Windows with smbclient on a PVM. There are many test tools (fio, iozone, bonnie, bonnie++, lmbench, unixbench, netperf, ...) and more formal approaches, but here I simply timed Copy/Paste in Windows, or cp or dd on Linux (averaged over a few runs, with different files in order to avoid cache effects), so it is definitely not an exact science. The first test used about 500MB of small files (the Program Files directory), the second 800MB of 110-120KB files, and the third much larger files (datafiles up to 1GB): the 1.1T database was copied in about 9 hours on the PVM, with cp and dd giving similar results (both SAN vluns presented on the same storage, ntfs-3g and ext3), which was close to bare metal ERP.

|_. Test |_. ERM |_. ERP |_. VMWare |_. ERC |
| diff. small files | 5.6MB/s | 30MB/s | / | / |
| 1x 110-120KB | 25MB/s | 62MB/s | 26.6MB/s | 66MB/s |
| 1GB | / | 53MB/s | / | 35.6MB/s |
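As a rough cross-check of the large-file figure, the copy time mentioned above can be converted into an average throughput. This is only a sketch: it assumes binary units (1T = 1024 x 1024 MB) and a flat 9 hours, and the function name is mine, for illustration only.

```shell
#!/bin/sh
# Average throughput in MB/s for copying SIZE_TB terabytes in HOURS hours.
# Assumption: binary units, 1 TB = 1024 * 1024 MB.
throughput_mb_s() {
    awk -v tb="$1" -v h="$2" \
        'BEGIN { printf "%.1f\n", tb * 1024 * 1024 / (h * 3600) }'
}

# The 1.1T database copied in about 9 hours:
throughput_mb_s 1.1 9
```

Under these assumptions the result is about 35.6 MB/s, which is in line with the ERC value in the 1GB row.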

I still don't have all the figures, but I have learned that with larger files a VMWare ESX4.0 physical drive (a SAN lun as raw drive using the virtual LSI SCSI adapter) performs similarly, on the same storage, to a Xen HVM physical drive (also a SAN lun). These results do not give the whole picture, except that HVMs have poor I/O compared to PVMs and bare metal (a pity that SIOEMU domains are not available). They also show that disk I/O bandwidth on PVMs, unlike on HVMs, can sometimes be as good as on bare metal: results are consistent with files of 100MB or less, and better but inconsistent with much larger files, while latency remains poor. This matches what I saw on forums, where people seem to get consistent results with file sizes below 100MB and very inconsistent ones with file sizes above 1GB.

h3. Some DB related results

Here are the basic settings and statistics for the database of each system (as a reminder, the ERP db node has no SAP instance on it, it is just a db node in a cluster), together with an ad hoc test: executing a simple query (259480 records in the table, with "set timing on" in sqlplus) and taking the minimum value (the first execution usually takes longer, mostly due to parsing and cold buffers). ERP is probably tuned better, but the other systems are sized and set according to their resources, at least for the most important parameters (including a clean situation in ST02 for ABAP, although I have omitted to correct the db cache for ERC).

|_. System |_. ERP |_. ERM |_. ERC |
| RAM | 24G | 6G | 6G |
| sga_max_size | 12G | 3.5G | 4G |
| pga_aggregate_target | 2G | 800M | 850M |
| shared_pool_size | 1.2G | 832M | 400M |
| PHYS_MEMSIZE (abap) | 14G | 3G | 2.5G |
| db cache (buffer) | 6G | 2G | 1.1G |
| data buffer quality | 99.5% | 91.3% | 96.6% |
| DD cache quality | 95.9% | 98.8% | 90.7% |
| select count(*) from dba_objects | 3.35sec | 2.35sec | 1.58sec |
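The "repeat the query and keep the minimum" approach used for the last row can be sketched as a small helper. The function name and the sample timings are made up for illustration; only the idea of taking the lowest elapsed time comes from the text.

```shell
#!/bin/sh
# Print the smallest of the elapsed times (in seconds) given as arguments.
min_time() {
    printf '%s\n' "$@" | LC_ALL=C sort -n | head -n 1
}

# Hypothetical timings from five repeated runs of the same query:
min_time 3.91 3.35 3.40 3.62 3.37
```

The first run is usually the slowest one, so the minimum is less sensitive to parsing and cold buffers than the average would be.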

These values depend on system load and user activity, so they can be ambiguous without proper conditions (at least running long enough with a similar load). I have also made ad hoc tests using the Tips & Tricks of transaction SE30 (one of my favorites), which in some cases show slightly better results on HVMs. It offers template queries and snippets which can be executed and measured in microseconds (each in two variants, one on the left side of the screen and one on the right). I executed some of them during regular system load, each a few times (up to 5-10 times), and took the lowest (minimum) value on each system (which is also very close to the average):

*SE30 (microseconds, lowest value on each system)*

|_. Test |_. ERP similar |_. ERP main |_. ERM similar |_. ERM main |_. ERC similar |_. ERC main |_. BIS similar |_. BIS main |
| SQL interface: Select aggregates | 8383 | 434 | 7480 | 457 | 6789 | 447 | 6937 | 436 |
| SQL interface: Select with view | 167610 | 19926 | 342670 | 16090 | 190347 | 12894 | 8825 | 966 |
| SQL interface: Using subqueries | 638 | 644 | 583 | 576 | 487 | 489 | 615 | 545 |
| Internal tables: Comparing int. tables | 461 | 43 | 306 | 25 | 407 | 27 | 411 | 27 |
| Context: Supply/Demand vs. select | 466 | 464 | 502 | 511 | 487 | 489 | 9781 | 703 |

*ST03N*

|_. Type |_. ERP total |_. ERP db |_. ERM total |_. ERM db |_. ERC total |_. ERC db |_. BIS total |_. BIS db |
| DIALOG | 323 | 89 | 1253 | 652 | 985 | 392 | / | / |
| BACKGROUND | 13307 | 5070 | 25626 | 23642 | 21115 | 4093 | / | / |

I have also included data for a BI / SEM IDES system, BIS (based on the NW7.0 ABAP+Java stack), running on a PVM with the same virtual resources as ERC; the only difference is that its db is 0.13T in size (while our production BI system with EP and Java stack is also about 1T in size).

Finally, ST03N statistics are useful for system tuning but very elusive for benchmarking, because they also depend heavily on user activity, which can vary a lot (and I still don't have good data: I used a weekend night with a similarly low number of users, but I am not satisfied with that).

h1. Conclusions

PVM guests have better performance in general (as expected), but the database should not be on a virtual machine if maximum performance is needed, unless there is some scaling out (e.g. using Oracle RAC). Booting a PVM guest takes less than a minute, while Windows on HVM takes minutes to become available and accessible through RDP. Beside performance, PVM brings many additional benefits, above all high availability and licensing: the RHEL Advanced Platform support subscription allows an unlimited number of guests per host with an unlimited number of sockets, at a fraction of the price of other vendors. Therefore, the migration path from Windows to Linux on Itanium is highly recommended and more justified than any other. I would need to do more thorough testing for detailed conclusions (a comparison with HP-UX above all), but I am now confident that this platform (Xen on RHEL5.4) is very stable and gives predictable and usable performance, and moreover it is supported by SAP (and to some extent by Oracle) and by HP as a commercial platform with good support. Virtualization itself does not bring many licensing benefits apart from eliminating Windows or virtualization licenses; only HP-UX offers some level of licensing consolidation, with VSE and Capacity-On-Demand supported by Oracle (dynamic CPU resource usage, though this makes sense only on big Superdomes with a large number of CPUs and occasional extremely high peaks).

A few thoughts about the future of Itanium: up to now Intel has had almost the same income share from ia64 as the RISC competition, and this trend is expected to continue (Itaniums will now use the same motherboard chipsets as new Xeons). Software vendors and other hardware vendors might have a different perspective. I can't follow any other signs from Intel and HP, and I am waiting for the latest TPC-C / SAP SD 2-tier (SAPS) and TPC-H benchmarks on BladeSystem infrastructure (not many at the moment), which should show a comparably good price/performance ratio. I also expect more official benchmark results with virtualization (VMWare at least, if not Xen) and scaling out (Oracle RAC / VM). Software vendors give different and sometimes confusing signals (SAP is very clear: Itanium is to stay, no change ahead so far), while other big hardware vendors don't offer real alternatives AFAIK. If it is about a critical business environment, and not about the best price/performance ratio or HPC, there is no good reason to change the CPU architecture from Itanium to another one. If it is about consolidation and virtualization while keeping the existing hardware architecture and critical business, there are many options, but they all come down to expensive HP-UX (justified for those with the highest demands) or Linux with Xen (Red Hat or maybe Suse). Otherwise, low risk, flexibility and Windows together, even on another architecture, cannot be justified.

h1. P2V and V2P migration

Physical to virtual migration (P2V) using SAN storage in this environment is fairly simple: no conversion tools are necessary if the vluns are used as physical (raw) drives. This approach is good not just for performance; it also helps if you need a disaster recovery scenario that involves moving to a previously prepared physical machine, to which the vluns can easily be presented. The P2V procedure is based on the Microsoft kb 314082 article, and it is mostly about MergeIDE.reg (as found in that article), which should be imported into the physical instance before migration (otherwise a BSoD is inevitable because the IDE drivers are not present). I have also copied (just in case) system32\drivers\pciidex.sys and windows\inf\mshdc.inf from a working HVM Windows guest to the physical instance, and that's it, piece of cake. The usual import of NVRAM with nvrboot might be needed, too. V2P migration is similar, and if specific drivers other than the EFI boot driver or those in the HP PSP are needed, %SystemRoot%\Setupapi.log on the target machine should be investigated.

h1. Host clone

I had to mention this: as part of a disaster recovery scenario, testing or deployment, using shared SAN storage makes life much easier. Not only can guest systems be easily deployed (including the usual methods with sysprep on Windows) and cloned from template installations; RHEL server instances used as hosts can be cloned easily, too. After making a copy (snapshot, snapclone, mirror) of the system disk and presenting it to a new host, it is only necessary to change the following (as a precaution I boot from the installation CD virtual media with linux mpath rescue and edit under /mnt/sysimage/..., but that is not necessary):

host name in: /etc/sysconfig/network
IP address in: /etc/sysconfig/network-scripts/ifcfg-eth0

Those are the only host-dependent settings (if kept like that), but one can build specific hosts where additional changes might be needed.
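The two changes can be scripted against the mounted clone. The sketch below assumes a static configuration with HOSTNAME= and IPADDR= lines in the two files (not DHCP), GNU sed, and that no other line (such as an HWADDR entry) ties the file to the old hardware; the function name, host name and address are hypothetical.

```shell
#!/bin/sh
# Rewrite host name and IP address on a cloned RHEL system disk.
# ROOT is where the clone's root file system is mounted
# (e.g. /mnt/sysimage when booted in rescue mode).
rehost_clone() {
    root=$1
    new_name=$2
    new_ip=$3
    sed -i "s/^HOSTNAME=.*/HOSTNAME=$new_name/" \
        "$root/etc/sysconfig/network"
    sed -i "s/^IPADDR=.*/IPADDR=$new_ip/" \
        "$root/etc/sysconfig/network-scripts/ifcfg-eth0"
}

# Example (hypothetical name and address):
# rehost_clone /mnt/sysimage newhost.example.com 192.0.2.10
```

All other lines in both files are left untouched, so netmask, gateway and similar settings are carried over from the original host.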

h1. Additional information

Here are some important facts I have omitted to write in the previous post:

VERY IMPORTANT: just as with MSCS or hosts that boot from SAN, it is necessary to set the disk timeout and IDE behavior (Xen uses IDE exclusively on HVM):

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4D36E96A-E325-11CE-BFC1-08002BE10318}\0001
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4D36E96A-E325-11CE-BFC1-08002BE10318}\0002
*ResetErrorCountersOnSuccess = 1* (DWORD)

... and also, the most important thing is to set disk timeout:

HKLM\System\CurrentControlSet\Services\Disk
*TimeOutValue = 120* (DWORD, 78 hex)

Without this, Windows guests might become unstable from time to time (the other settings here are also important).

- for old HP EVA storages (with A/P (active/passive) paths, e.g. EVA 5000 with firmware older than VCS4.00), it is only necessary to define the following for each multipath, beside wwid and alias (instead of the other parameters from the multipath example for A/P storages I gave earlier):

 prio_callout "mpath_prio_hp_sw /dev/%n"
 path_checker hp_sw
 rr_weight priorities

... and as the defaults section in multipath.conf is set for newer A/A (active/active) paths, old paths have to be used with these parameters set explicitly. It is also recommended to use preferred paths on old storages.

- some general recommendations about moving guests manually (from host to host; migration is unsupported on Itanium, as mentioned, but this doesn't mean it can't be done, at least manually or using scripts):

- when changes are made to a storage vlun, the host's buffer cache must be flushed and invalidated to make the host aware of them (e.g. the virtual machine is shut down on a host, a restore is done on the storage or changes are made from a different remote host, and then the cache is dropped before the virtual machine is brought up again; otherwise, data corruption and inconsistencies are to be expected):

sync
echo 1 > /proc/sys/vm/drop_caches
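Put together with stopping and starting the guest, the sequence might look like the sketch below. It is only an outline: the guest name and its config file under /etc/xen are hypothetical, the xm options should be checked against the installed Xen version, and by default the commands are only printed (DRY_RUN=1) rather than executed.

```shell
#!/bin/sh
# Sketch: stop a guest, change the storage, drop the host's page cache,
# start the guest again. With DRY_RUN=1 commands are printed, not run.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else sh -c "$*"; fi
}

stop_guest() {
    run "xm shutdown -w $1"                   # wait until the guest is down
}

start_guest_clean() {
    run "sync"                                # write out dirty pages first
    run "echo 1 > /proc/sys/vm/drop_caches"   # then drop the page cache
    run "xm create /etc/xen/$1"               # bring the guest up again
}

stop_guest guest1
# ... restore or change the vlun on the storage here ...
start_guest_clean guest1
```

Running it with DRY_RUN=0 as root would execute the commands for real, so the dry-run output is worth reviewing first.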

Dropping the cache is not needed if the guest is being brought up for the first time since the last reboot of that host, or if GFS / Cluster Suite is used. If advanced disk caching is used on Windows guests, it is also good to flush the disk cache on the guest before making a storage snapshot (beside preparing the applications for the backup) with a useful tool:

Sysinternals - sync

- about the virtual EFI issue: if you expect many interventions in the guest EFI during boot, it is recommended to use dom0_max_vcpus=4 in elilo.conf and (dom0-cpus 4) in /etc/xen/xend-config.sxp, and, from my experience, to give Dom0 about 10% of the physical memory (or even 1G more, depending on the number of active guest systems, if swapping is evident, e.g. as shown by free). For example, for 32G of RAM this would be (together with (dom0-min-mem 3200) in xend-config.sxp to avoid memory ballooning):

 append="dom0_max_vcpus=4 dom0_mem=3200M -- quiet rhgb"

... otherwise, 2G of RAM for Dom0 is sufficient (and even a smaller number of vcpus, depending on the number of HVM guests and virt-manager responsiveness: they consume Dom0 resources for I/O, which can be addressed by giving Dom0 additional credit; none of these settings significantly changes the performance of DomUs, which remains completely stable).
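The 10% rule of thumb is easy to turn into the three settings mentioned above. This sketch follows the rounding used in the example (32G gives 3200M, i.e. 100 MB per GB of RAM); the function name is made up, and the output is only meant to be copied into elilo.conf and xend-config.sxp by hand.

```shell
#!/bin/sh
# Dom0 memory in MB as roughly 10% of physical RAM, using the same
# rounding as the example in the text (100 MB per GB of RAM).
dom0_mem_mb() {
    echo $(( $1 * 100 ))
}

ram_gb=32
mem=$(dom0_mem_mb "$ram_gb")
echo "elilo.conf:      append=\"dom0_max_vcpus=4 dom0_mem=${mem}M -- quiet rhgb\""
echo "xend-config.sxp: (dom0-min-mem $mem)"
echo "xend-config.sxp: (dom0-cpus 4)"
```

If swapping shows up on Dom0, the value can be raised by hand (the text suggests up to about 1G more).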

- similar to an OS kernel scheduler, Xen schedules cpu resources to guests based on their relative weight (default 256 for all domains; a domain with twice the weight gets twice the resources), for example (boosting Dom0):

 xm sched-credit -d Domain-0 -w 1024
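To get a feeling for what a weight means, the expected share under full contention can be estimated as weight divided by the sum of all weights. This is a simplification (it assumes every domain is CPU-bound, uncapped and has enough vcpus), and the set of domains below is hypothetical.

```shell
#!/bin/sh
# Estimated CPU share (percent) of a domain with weight W when all listed
# domains compete for CPU. First argument: W; remaining arguments: the
# weights of all domains, including W itself.
credit_share() {
    w=$1
    shift
    printf '%s\n' "$@" |
        awk -v w="$w" '{ sum += $1 } END { printf "%.1f\n", 100 * w / sum }'
}

# Dom0 boosted to 1024, three guests left at the default 256:
credit_share 1024 1024 256 256 256
```

With these hypothetical weights Dom0 would be entitled to roughly 57% of CPU time when everything is busy; with all four domains at the default 256 each would get 25%.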

- there is a way to partition the available CPUs and avoid unnecessary context switching between different physical CPUs, by setting the cpus parameter in the guest configuration file (in /etc/xen). For instance, BL870c has 4 CPUs, each with two cores, each core with two threads (can be set in the EFI shell with cpuconfig threads on); these are numbered by Xen as cpus 0-15 (the first cpus in this list are used for Dom0). So, if a guest is to run only on the second CPU, it should include cpus="4-7" in its configuration file.
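The range for any physical CPU follows from simple arithmetic, assuming (as the numbering above suggests) that Xen numbers the logical cpus of one physical CPU contiguously. The helper name is made up for illustration.

```shell
#!/bin/sh
# Logical cpu range for one physical CPU (socket), counted from 0.
# Arguments: socket index, cores per socket, threads per core.
cpus_for_socket() {
    per_socket=$(( $2 * $3 ))
    first=$(( $1 * per_socket ))
    echo "$first-$(( first + per_socket - 1 ))"
}

# Second CPU (index 1) of a BL870c with 2 cores x 2 threads per CPU:
echo "cpus=\"$(cpus_for_socket 1 2 2)\""
```

For the second CPU this reproduces the cpus="4-7" line from the example; the fourth CPU (index 3) would give 12-15.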

- another important EFI setting is MPS optimization (performance optimization with maximum PCIe payload), which is only available on HP-UX, OpenVMS and Linux (not supported on Windows):

ioconfig mps_optimize on 

... and in the EFI shell there is also a *drvcfg -s* command which should be used in some environments (depending on the OS, though I didn't notice an important change with this setting); or, using the drivers command to extract driverNr deviceNr pairs, issue it manually (for the pair 27 2B here):

drvcfg -s 27 2B

... and then (similar to EBSU setup), go through the menu-driven setup of the given FC driver (options, for example: 4, 6, 0, 11, 12, set OS mode, back to 4, 5, 0, 11) and reboot the server with the RS command from the MP.

- the mandatory acpiconfig setting for cell based (NUMA) rxNN20 and similar Integrity servers is single-pci-domain, while for newer ones (rxNN40 and newer) it is default (instead of windows):

acpiconfig default 

... while in the HVM guest (virtual) EFI environment I have used these settings in NVRAM for OsLoadOptions (the first disables DEP, which sometimes caused serious problems on MSCS bare metal machines, and /novesa proved useful on some newer machines, not just during Windows setup). An example of how to set this in the EFI shell (before making the guest bootable):

map -r
fs0:
cd MSutil
nvrboot

... and then the I option (import), choosing 1 for the boot option and 2 for the parameter line:

OsLoadOptions=/noexecute=alwaysoff /redirect /novesa 

- about installation: I omitted to mention that during host or guest installation (which is generally needed only once per site / per template), the default Virtualization and Development software groups are sufficient, though after browsing more thoroughly through the Oracle and SAP prerequisites I found that some additional packages were missing: sysstat, compat-openldap, libXp, 'Legacy Software Development', and an rpm in saplocales_rhel5_ia64_version-2.zip found in Note 1048303
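A quick way to see which of the individual packages are still missing on a host or guest is to compare them with the installed package names. The helper below is a sketch with a made-up name; it reads the installed names from standard input so that it does not depend on rpm itself, and it covers only the three plain packages (the 'Legacy Software Development' group and the saplocales rpm are handled separately).

```shell
#!/bin/sh
# Print which of the extra packages named above are missing from the list
# of installed package names read from standard input (one per line).
missing_packages() {
    installed=$(cat)
    for pkg in sysstat compat-openldap libXp; do
        printf '%s\n' "$installed" | grep -qxF "$pkg" || echo "$pkg"
    done
}

# On the host or guest (requires rpm):
#   rpm -qa --qf '%{NAME}\n' | missing_packages
# Anything reported could then be added with, for example:
#   yum install sysstat compat-openldap libXp
#   yum groupinstall 'Legacy Software Development'
```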

h2. To Do ...

I find that very few interesting things about this subject are left somewhat unknown or untested, and they are mostly not important. Above all, I am looking for features that we use every day with SAP on Windows which should be mapped and migrated to RHEL. AFAIK, there isn't a single such feature that is not available on RHEL, but there are important features (like virtualization on IA64) which are not available on Windows ...

- I haven't checked the available HA scenarios: it is probably a variant of Netweaver failover, and I would prefer to see the behaviour of a cluster on guest systems
- SSO, SNC login with Front End clients on Windows against a RHEL server? Yes, in a way, using SPNego and Kerberos instead of SNC, and this goes for Java AS / Enterprise Portal as well as ABAP. I know that it is possible to have SPNego with Kerberos (the stronger option in security terms, safer than SNC in general even on Windows GUI, fully supported by Microsoft, SAP and RHEL), but this should be tested. There are great SDN blogs and SAP Notes available about this:

SPNego - Wiki
Configuring and troubleshooting SPNego -- Part 2
Configuring SPNego with ABAP datasource -- Part 2
Note 968191 - SPNego: Central Note
Note 994791 – Wizard-based SPNego configuration
Note 1082560 - SAP AS Java can not start after running SPNego wizard
Using Logon Tickets on Microsoft Based Web Applications

I am currently using SSO with the existing Windows domain authentication via SNC for the Windows Front End, and Smart Card authentication via PKI and client certificates for WebGUI and Enterprise Portal. SPNego authentication fully supports all these scenarios in a RHEL server environment. In the simplest form, if the SAP user account is the same as in Windows ADS, it passes through SSO; otherwise, username/password is available, or special handling (mapping) of accounts.

- comparing distributed against single server systems (database not virtualized), and other interesting combinations which I didn't cover here (including performance scaling with the number of (v)cpus), moving more of the existing physical systems to virtual, and testing backup and restore scenarios. My experience with Data Protector 6.11 on RHEL is excellent: everything worked from the start (in the past I had a hard time on Windows with SAP integration and other things), and there are really many possibilities (at the moment I back up guests using physical image backup on the host through SAN to tape or ZDB, which works in a very stable, fast and reliable way).

h1. References

SAP on Linux

Setting the TimeOutValue registry for Windows 2000 or 2003 Technet - disk timeout
 Oracle Metalink note ID 563608.1: Oracle SLES / Xen is supported
 XEN - SIOEMU Domain
 SAP Note 1122387 – Linux: Supported Virtualization technologies with SAP
 SAP Note 171356 – Virtualization on Linux: Essential information
 SAP Note 1400911 - Linux: SAP on KVM - Kernel-based Virtual Machine
 Virtualbox - MergeIDE
 Microsoft kb article about booting IDE device (P2V): kb314082
 HTTP-based Cross-Platform Authentication via the Negotiate Protocol
 SSO with logon tickets
 about SSO on help.sap.com
 SAP, Itanium, Linux and virtualization ...
 IDG press release
 Wikipedia - Tukwila
 Itanium

TPC-H
TPC-C
