Why You Shouldn’t Be Afraid of SSL Performance
I’ve often wondered why I see so many SAP systems which don’t offer SSL connections for their client applications, be they traditional SAP GUI clients, browsers running BSP or Web Dynpro applications or the IE runtime embedded into Netweaver Business Client. To me this has always seemed strange because SAP applications contain a lot of information I thought companies would be interested in keeping safe and secure: customer data, accounting information, salary information, etc.
Every time I asked the question of a Basis consultant though, I received an answer along the lines of “SSL introduces too much overhead”, or it “slows things down”.
But hang on, it’s 2013! Google searches are now all done over HTTPS by default, for 38,000 search queries per second! So how hard can it be?
Let’s try to get a handle on this by looking at the impact of switching to SSL on both network performance and server-side CPU load. I’m not going to worry about client-side CPU impact, as there are many more clients than servers, and they’re all multi-core systems running idle most of the time anyhow :-).
To keep things simple, I will only look at HTTP rather than the DIAG protocol used by SAP GUI. HTTP – through Web Dynpro and NWBC – is the future anyhow ;-).
The vast majority of SAP’s products – from old R/3 systems to HANA – use the Internet Communication Manager as the bridge between incoming HTTP traffic and the applications. This is nice because the ICM is more powerful than usually appreciated, and findings here with one system are broadly applicable to many of SAP’s products.
Let’s first recall the OSI model to understand where SSL (really, TLS) sits relative to other networking protocols we need for our HTTP communication example:
DNS, FTP, HTTP, SMTP
IP, ARP, ICMP
ATM, IEEE 802.2, IEEE 802.3
IEEE 802.11, DSL, Bluetooth
Because there are turtles all the way down, I’ll only focus on the highlighted sections above. The other layers are interesting and very important to understand (mobile app connectivity, anyone?), but they’re not overly relevant right now.
So, let’s assume we have obtained an IP address (i.e. everything up to Layer 3 is done) and want to make a simple HTTPS GET request to an SAP system via SSL. Let’s also assume we are using HTTPS from the outset by specifying a URL starting with https://, rather than starting with HTTP and letting the server upgrade the protocol to HTTPS. I think that’s reasonable for the usual portal/NWBC-based interactions with SAP systems; I don’t think many users type WebDynpro URLs directly into their browser…
This scenario results in the following flow of messages:
Messages 1, 2 and 3 establish a TCP connection at Layer 4 in the OSI model. This is the typical three-way TCP handshake which establishes a reliable TCP session, and is required regardless of whether SSL is used to encrypt communications or not. In the OSI model, higher-level protocols such as HTTP or SSL build upon the foundations established by the lower layers.
The next 4 messages shown in orange are where the real work to set up an SSL session happens. This is where the client verifies the server’s identity, and both parties independently derive a shared session key used to encrypt subsequent communication. I will leave the details of this public/private-key cryptographic “magic” to the links at the end, but a few things are important here:
- There are “only” two round-trips across the network. This isn’t so terrible even across a WAN.
- (To be precise, these are round-trips at the SSL layer. If the server uses large 4096- or 8192-bit keys, a long certificate chain, etc., the handshake messages can exceed TCP's initial congestion window, and the server then has to wait for acknowledgements before sending the rest, resulting in further round-trips at the TCP layer.)
- Asymmetric public/private-key encryption is used to establish this shared encryption key. Asymmetric encryption is relatively slow and CPU-intensive, so the handshake is expensive.
- Once the key is established, both parties will use fast and efficient symmetric encryption (e.g. 3DES, AES, RC4 or any other algorithm agreed upon during the handshake) to encrypt data. This is usually fast.
Finally the client can send its HTTP request message over the SSL session established earlier. The server will respond to the request across the same pipe, which is secure from eavesdropping by outsiders. Nice!
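The three stages of this flow (TCP handshake, SSL handshake, HTTP request) can be traced in a few lines using Python's standard `ssl` module. This is only a minimal sketch of the sequence, not what a browser or soapUI does internally; `make_context` and `https_get` are names made up for this example, and the host and path are placeholders you would supply yourself.

```python
import socket
import ssl


def make_context() -> ssl.SSLContext:
    # Default client context: verifies the server's certificate chain
    # and checks that the certificate matches the host name.
    return ssl.create_default_context()


def https_get(host: str, path: str = "/", port: int = 443) -> bytes:
    """Fetch one resource over SSL/TLS. `host` and `path` are placeholders."""
    context = make_context()
    # Messages 1-3: the TCP three-way handshake happens inside create_connection.
    with socket.create_connection((host, port), timeout=10) as tcp_sock:
        # SSL handshake: server identity check and session key agreement.
        with context.wrap_socket(tcp_sock, server_hostname=host) as tls_sock:
            request = (
                f"GET {path} HTTP/1.1\r\n"
                f"Host: {host}\r\n"
                "Connection: close\r\n\r\n"
            )
            # From here on, request and response are encrypted with the session key.
            tls_sock.sendall(request.encode("ascii"))
            chunks = []
            while True:
                data = tls_sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
            return b"".join(chunks)
```

Calling `https_get("your.sap.host", "/some/path", 443)` against a reachable server returns the raw HTTP response, headers included.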
Server CPU Impact
You know, Moore’s Law is great. Every 18 months, computing capacity is expected to double.
But over the lifetime of an SAP system, it’s astonishing! Applied over 10 years, we would see computing power increase by a factor of 100! One of my current clients has been running SAP ERP since 1998, or for 15 years. That’s long enough to witness a factor 1000 improvement!!
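The arithmetic behind those factors is easy to check. A small sketch, assuming the 18-month doubling period quoted above holds for the whole period (`growth_factor` is just a helper name for this example):

```python
def growth_factor(years: float, doubling_months: float = 18.0) -> float:
    """Capacity multiple after `years` if capacity doubles every `doubling_months`."""
    doublings = years * 12.0 / doubling_months
    return 2.0 ** doublings


for years in (10, 15):
    print(f"{years} years: factor of about {growth_factor(years):.0f}")
```

Ten years is 6.67 doublings, or roughly a factor of 100; fifteen years is exactly 10 doublings, or 2^10 = 1024.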
So maybe the slow, asymmetric encryption used to derive session keys during an SSL handshake was an issue when many experienced Basis consultants set up their first systems. But today? Definitely not!
My 2008-era MacBook Pro can compute a 2048-bit RSA signature in 9.1 milliseconds, my 2012 MacBook Air takes 6.1 ms, and the Xeon-based systems we run various SAP apps on take 3.5 ms.
Once the SSL session is established, data travelling across it is encrypted using more efficient symmetric algorithms such as AES or RC4. Their throughput is even greater: 173 MB/s and 650 MB/s respectively on the same Xeon-based servers.
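To put these numbers into perspective, here is a back-of-the-envelope sketch in Python. It assumes one private-key operation per full handshake on a single core and ignores everything else the server does, so treat the results as rough upper bounds rather than measurements; the helper names are made up for this example.

```python
# Figures quoted above; everything derived from them is an estimate.
SIGN_MS = {"2008 MacBook Pro": 9.1, "2012 MacBook Air": 6.1, "Xeon server": 3.5}
CIPHER_MB_PER_S = {"AES": 173.0, "RC4": 650.0}


def operations_per_second(ms_per_operation: float) -> float:
    """How many operations fit into one second on one core."""
    return 1000.0 / ms_per_operation


def seconds_to_encrypt(megabytes: float, throughput_mb_per_s: float) -> float:
    """Time needed to encrypt a payload at a given cipher throughput."""
    return megabytes / throughput_mb_per_s


for machine, ms in SIGN_MS.items():
    print(f"{machine}: ~{operations_per_second(ms):.0f} RSA signatures/s per core")

for cipher, rate in CIPHER_MB_PER_S.items():
    millis = seconds_to_encrypt(1.0, rate) * 1000.0
    print(f"{cipher}: ~{millis:.1f} ms to encrypt 1 MB")
```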
Based on these figures, the performance impact of moving to SSL should be between small and negligible. However it’s always a good idea to confirm theories by way of a test.
The exact setup is described below in a lot more detail, but here’s the short version:
- 50 parallel threads in soapUI sending GET requests to a resource in the ICM memory cache
- 200 requests per thread. No delay between requests.
- 500ms thread start-up delay (i.e. ramp-up takes 25s)
- Pre-emptive HTTP Basic authentication (i.e. credentials are sent as part of initial GET request rather than after receiving a 401 response from the server asking for authentication)
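The numbers in this setup, and what "pre-emptive" Basic authentication means on the wire, can be sketched in Python as follows. The constants mirror the list above; the function names and the user/password values are purely illustrative.

```python
import base64

THREADS = 50
REQUESTS_PER_THREAD = 200
STARTUP_DELAY_S = 0.5


def total_requests(threads: int = THREADS, per_thread: int = REQUESTS_PER_THREAD) -> int:
    """Number of requests sent in one test run."""
    return threads * per_thread


def ramp_up_seconds(threads: int = THREADS, delay_s: float = STARTUP_DELAY_S) -> float:
    """Time until all threads are running, one thread starting every `delay_s`."""
    return threads * delay_s


def basic_auth_header(user: str, password: str) -> str:
    """Authorization header value sent with the very first request,
    instead of waiting for the server to answer with a 401."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


print(total_requests(), "requests,", ramp_up_seconds(), "s ramp-up")
print("Authorization:", basic_auth_header("user", "pass"))
```

Sending the header up front saves one request/response pair per connection, which keeps the authentication step from distorting the response time measurements.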
We will use three scenarios:
- Plain-text HTTP. Enough said.
- HTTPS, by changing the URL used in the test to https://
- HTTPS as above but with the “Close Connection between requests” option selected, forcing a TCP connection and SSL handshake for every request. In the graphs, I’ve called this “SSL All-New”
The main concern here is response time. If SSL under load takes significantly longer, then this will translate into a poor experience for the user. SoapUI recorded the response times in milliseconds for each test scenario, and Excel created the following histogram:
|Throughput||Response Time (ms)|
|requests/sec||Median||95th percentile||99th percentile|
Here the cost of SSL traffic does become visible. However the overhead is small for the ‘normal’ SSL scenario at less than 10ms – not noticeable to end users.
The ‘three peaks’ distribution of the “all-new” scenario does seem a little strange and could merit further investigation into profile and ICM parameters, or even repetition using a different client tool to eliminate issues on the soapUI side.
Slightly more concerning are the approx. 2% of requests which took longer than 200ms to complete in the “all-new” scenario. Most of these took over 1 second, which will definitely be noticeable to end users. If the “all-new” scenario of 75 SSL handshakes a second sounds like something your ICM server could experience, then further tests and tuning would definitely be a good idea. I can’t begin to speculate on the cause of this phenomenon.
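If you repeat the test and want the same statistics from your own response time log, they are simple to compute. The sketch below uses the nearest-rank definition of a percentile, which is an assumption on my part (Excel and soapUI may interpolate differently), and the sample values are made up for illustration only.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    `pct` percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def share_slower_than(samples: Sequence[float], threshold_ms: float) -> float:
    """Fraction of requests that took longer than `threshold_ms`."""
    return sum(1 for value in samples if value > threshold_ms) / len(samples)


# Made-up response times in ms, NOT the measurements from this test.
response_times = [4, 5, 5, 6, 6, 7, 8, 9, 250, 1200]
print("median:", percentile(response_times, 50))
print("95th percentile:", percentile(response_times, 95))
print("share over 200 ms:", share_slower_than(response_times, 200))
```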
I used a simple ‘top’ command on the server to record the CPU usage of the icman process, which is what the ICM is called at the operating system level. Here are the results, smoothed using a 15 second moving average:
Although the ‘normal’ SSL scenario consumed around a third more CPU than the same test using plain-text HTTP, it still stayed below 5% of CPU usage on a single core. The ‘worst case’ test of establishing a new connection for every request consumed around 3 times more CPU than the plain-text base case. But even then, the 15-second moving average did not go above 10% of a single CPU core.
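For reference, the smoothing is nothing fancy. A minimal sketch of a trailing moving average in Python, assuming one CPU sample per second so that a window of 15 samples corresponds to 15 seconds (the sampling interval is an assumption for this example, as are the sample values):

```python
from typing import List, Sequence


def moving_average(samples: Sequence[float], window: int) -> List[float]:
    """Trailing moving average; the first value is emitted once
    `window` samples are available."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    running_total = 0.0
    for index, value in enumerate(samples):
        running_total += value
        if index >= window:
            # Drop the sample that just fell out of the window.
            running_total -= samples[index - window]
        if index >= window - 1:
            averages.append(running_total / window)
    return averages


# Made-up CPU percentages for the icman process, one per second.
cpu_samples = [2.0, 3.5, 9.0, 4.0, 3.0, 12.0, 5.0, 4.5]
print(moving_average(cpu_samples, window=3))
```

Smoothing like this hides short spikes, so a 15-second average below 10% does not rule out brief peaks above it.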
Basically, even at ~75 SSL handshakes per second, the impact on a multi-core system would be barely noticeable.
Snapshots of the ICM workload taken during the test also back up this picture. Even in the “all-new” scenario, the number of ICM worker threads reached a peak of 64 – again nothing to worry about on a sandbox system with a maximum of 250!
Using SSL means encrypting packets between the client and the server. Intermediary proxy caches, WAN optimisers and other network infrastructure often deployed in global organisations to squeeze more out of their expensive WANs thus won’t be able to look into those packets and cache them for other users. So if you’re using SSL to encrypt icons, stylesheets, and other static, non-private resources, then there will be a performance cost as clients have to fetch them from the server every time. Depending on your topology, this may be expensive and sub-optimal. However I’m testing the SAP server-side impacts here, so I’m calling this out of scope.
Using SSL also means at least one additional round-trip to the server to perform the initial handshake. With my client only around 2ms away from the server, this won’t really be noticeable. I’m guessing it also won’t be noticeable loading a Web Dynpro app with dozens of HTTP resources and hundreds of kilobytes of data. But it might make a difference to high-volume or low-latency API consumers (REST clients, anyone?), so it’s something to consider.
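The latency cost of the handshake scales directly with the network round-trip time. A rough sketch, assuming the two handshake round-trips of a full SSL handshake and ignoring TCP effects and session resumption; apart from the 2 ms measured here, the round-trip times are hypothetical examples.

```python
def added_latency_ms(rtt_ms: float, handshake_round_trips: int = 2) -> float:
    """Extra wait before the first HTTP request can be sent over a new SSL session."""
    return rtt_ms * handshake_round_trips


# 2 ms is the distance in this test; the other values are hypothetical.
for label, rtt in (("same site", 2.0), ("national WAN", 30.0), ("intercontinental", 150.0)):
    print(f"{label}: +{added_latency_ms(rtt):.0f} ms per new SSL session")
```

With keep-alive, this cost is paid once per connection rather than once per request, which is why it matters most to clients that open many short-lived connections.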
Server Setup – Code
On an SAP system, HTTP requests go through the ICM which acts as both an HTTP server and reverse proxy in front of the ABAP application server. The ICM also terminates SSL traffic, and this is where the pain of many SSL handshakes should be felt most acutely.
On the SAP server side, I am reusing an HTTP handler which I had previously written for another load testing exercise and which contains a very bad implementation of a Fibonacci sequence generator. It’s “very bad” on purpose in order to generate a lot of CPU load; if I knew what I was doing with ABAP, I probably could have written a terrible or even horrendous implementation.
The full code of the IF_HTTP_EXTENSION~HANDLE_REQUEST method is available here as a Gist. Suggestions towards an “improved” v2 are always welcome! 😉
In order to separate the server load of processing HTTP and SSL traffic from the load caused by actually processing request messages and generating responses, we need to enable caching of responses in the ICM memory cache. This is in-memory computing without HANA and really a cool feature! Setting this will cause the ICM to serve responses very quickly and without going back to the ABAP system and consuming CPU cycles in the process.
ICM caching is easily enabled by adding the following line into the IF_HTTP_EXTENSION~HANDLE_REQUEST method of your HTTP handler: server->response->server_cache_expire_rel( expires_rel = '300' ).
Alternatively, it would also be possible to use some other resources which are currently cached by the ICM. The current contents of the ICM’s cache can be seen via transaction SMICM > Goto > HTTP Plug-In > Server Cache > Display. Usually it will contain image files or CSS style sheets belonging to the Web Dynpro or UI5 frameworks by default.
Server Setup – SSL
In order to test SSL performance, the SAP system needs to be given some SSL certificates via transaction STRUST, and the ICM needs to be configured with an SSL port and the relevant parameters. If you want to test SSL with client authentication, then client certificates will also need to be issued, installed in the client’s certificate store, and their fingerprints mapped to an SAP user account using table USREXTID – all things which I won’t go into for fear of making this blog even longer…
There is a large variety of client tools which can generate HTTP traffic to load test a server. JMeter, cURL or httperf would all do the job extremely well, but I’m using soapUI here, mostly because it’s good enough for the job and it’s a tool I use quite a bit, so there’s no learning curve.
SoapUI lets me create a test script which I can then execute using its built-in load testing functionality using multiple parallel threads. For this test, I’m aiming to measure the CPU utilisation of the SAP system’s ICM process under repeatable load: in this case, 50 parallel threads sending a new request as soon as a response to the previous request has been received.
I’ll use two scenarios here:
- “Normal” SSL. Each thread makes 10 requests to the server before exiting (soapUI will immediately restart it to continue the test). While the thread is running, though, it will reuse the TCP connection with the server according to the server’s instructions in the keep-alive header. This means it also won’t drop the SSL session, and thus performs fewer ‘expensive’ SSL handshakes.
- “All-new” SSL. The connection with the server is closed after each request, meaning a new SSL handshake needs to be performed for every request. This isn’t a realistic scenario, as no well-behaved client such as a web browser would do this, but it helps in placing additional load on the SSL components on the server.
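The difference between the two scenarios comes down to how many connections the client opens. As a rough illustration (this is not how soapUI is implemented), the Python sketch below starts a throwaway local HTTP server that counts accepted TCP connections, then sends the same number of requests with and without connection reuse. It uses plain HTTP on localhost to stay self-contained; over HTTPS, each of those new connections would also need an SSL handshake. `CountingServer`, `Handler` and `count_connections` are names made up for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingServer(ThreadingHTTPServer):
    """Local stand-in server that counts accepted TCP connections."""
    daemon_threads = True

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.connections = 0

    def get_request(self):
        request = super().get_request()
        self.connections += 1
        return request


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # allows keep-alive connections

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def run_requests(port: int, count: int, reuse: bool) -> None:
    conn = http.client.HTTPConnection("127.0.0.1", port)
    for _ in range(count):
        conn.request("GET", "/")
        conn.getresponse().read()
        if not reuse:
            conn.close()  # the next request() reconnects automatically
    conn.close()


def count_connections(count: int, reuse: bool) -> int:
    """Send `count` requests and return how many TCP connections the server saw."""
    server = CountingServer(("127.0.0.1", 0), Handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        run_requests(server.server_address[1], count, reuse)
    finally:
        server.shutdown()
        server.server_close()
    return server.connections


if __name__ == "__main__":
    print("normal:", count_connections(10, reuse=True), "connection(s)")
    print("all-new:", count_connections(10, reuse=False), "connection(s)")
```

With reuse, the ten requests of a thread share a single connection; without it, every request pays for a new one.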
The soapUI project file is also on GitHub if you’re interested in repeating the tests on your own system.
A whole raft of blogs and presentations have been very useful in devising these tests. If you’re interested in SSL from a performance perspective, I would highly recommend the following as ideal primers to what quickly becomes a deep dive into the multiple layers of any IT stack:
What I like most about this is that you've applied a bit of science to the argument, including a description of how you set up the experiment. I know that everyone's network is different, but this makes it easier to demonstrate the ROI on applying SSL to all our network traffic by replicating the experiment in our own environment.
Well, you have to apply some science/data/measurements, otherwise you're just making more or less educated guesses! See also the UK.gov's excellent digital design principles, especially this one: http://www.flickr.com/photos/psd/9096907242/ Sadly, there's not always the time available...
Really nice work Sascha.
Just a side note: please don't use RC4 anymore (see http://blog.cryptographyengineering.com/2013/03/attack-of-week-rc4-is-kind-of-broken-in.html )
Thanks Uwe! 🙂
Frustratingly, turning off RC4 isn't a free lunch either as AES-based crypto suites may be vulnerable to the BEAST attack. I actually did this on one of our dev servers and found out via the SSL Labs test. No perfect solution at the moment if you need to support older browsers... :-\
True, I recently read an article in c't on this topic.
(see http://www.heise.de/artikel-archiv/ct/2013/09/074_SSL-unter-Druck , German, €1,50)
Fighting to use SSL is my personal Don Quixote experience. The best part is when people use analysis reports from before 2000, or reports from SSL appliances (yeah, these are objectively done), to prove that SSL slows down the server.
Followed by: portal (or insert your specific case) is slow because of SSL, while the truth is that WDJ/A is slow and someone put all logs into debug.
An actual problem is certificate validation. IMHO it is NOT my job to explain to the security guys how SSL works, that they have to set up their PKI correctly, and how to distribute and install the root cert into browsers.
Everyone always blames the portal, eh? But I do agree, there are a lot of stats floating around from years, and sometimes decades ago. Google had some good figures, which I can't find right now, but they basically came out and said they couldn't tell the difference in server utilisation after switching on SSL for everything... My analysis seems to back this up, and there is always more tweaking that can be done: hardly anybody needs 4096bit keys for example, or you could prefer the RC4 algorithm, etc. etc.
The easiest way to NEVER have this kind of blog / conversation again is ... SAP needs to activate SSL by default, and require a change, 5 notes with custom reports attached and mandatory SolMan usage (just to really scare everyone off) to deactivate SSL.
But hey, who am I after all? People are accessing SAP ICM using strange ports like 8000 or 50000 just because a reverse proxy is not installed by default (talk about UX. That kind of access never comes to mind when SAP does design thinking or whatever is hyped right now?). Not to mention seeing people use URLs like sap/bc/gui/sap/its ...
It should be activated by default, so we don't need to have these recurring discussions.
Just investigating it over and over again probably costs more than the actual TCO.
Good idea to use SolMan as a scare-tactic 🙂
And right you are when it comes to the URLs. We never really think of using "friendly" URL aliases.
If I take the official SAP notes on SSL, or on SNC, they specify a maximum overhead of 10%, which is kinda significant.
Now suppose your machine has 5 CPUs, maxed out. Add 10% to that: +0.5 CPU. However, I never heard of half a CPU, so you have to add a whole CPU, which means 20% more CPU cost, and probably also an impact on the database licensing (which is often CPU-based).
That's a frightening figure for companies.
Now I do understand that 90% of the machine load is actually on server side processing and only 10% is on the communication, so an extra 10% on the communication would only be 1% overall.
Then why on earth does SAP scare us so much by advertising a 10% overhead?!
well, I'd wager that either nobody has really updated those percentages since they were written down or they're extremely generous to always deliver positive surprises to customers 🙂
Either way, I haven't seen anything close to a 10% hit, either in my tests or in the literature on other companies. If it really cost Google that much to enable SSL by default on all their searches and apps, then I doubt they would have done it. After all, 10% is probably tens of thousands of servers for Google 🙂
The overhead created by SSL can easily be compensated by better ABAP programming and by not using WebDynpro. With HTML5 work can be transferred to the browser, that compensates more than what SSL adds. In case you trust your LAN, a reverse proxy can be used, taking the SSL load away from the SAP servers.
It is more frightening for companies to deal with session theft when someone takes over your cookies or logon information.
I'm not too happy with the statement: "not using webdynpro"
and if you do your business logic the correct way (performant) then your WDA is not _that_ slow
(and I'm still hopeful that there will come an HTML5 client side rendering flavour to WDA)
SSL is really applicable for business users on the internet, where spoofing and DoS are a concern.
But for an internal implementation, where spoofing is easily prevented with proper network configuration and devices (VLANs, switch protection) and there is no DoS, why is SSL important?
The main purpose of using SSL encryption is to safeguard information from being intercepted and read by someone other than the sender and receiver. As you said this is very important on the internet, but it's not any less important inside a company's firewall.
Take wireless networking for example: Yes, most companies have encryption enabled on their WLAN like WPA or WPA2. But this only stops unauthorised access to the network by outsiders - anybody already connected to the network can (in most cases) see all packets sent across the network, including your CFO's password or their SSO2 cookies. (see Firesheep).
Although not a silver bullet, enabling SSL here will do a great amount of good in blocking such attacks.
The point I would like to raise is that encryption in an internal network is really just redundant protection, or a sign that other security measures have failed. There are many security aspects which should be implemented or considered before we harden the security of our current SAP systems.
- If the PC/Windows administrator has configured the security of the Windows environment properly, no user should have software for sniffing or network analysis.
- If the network administrator has configured network security properly, sniffing will be very hard and will require special skills. This includes switch protection, routers, VLANs (including a specific VLAN for executives), proxies, Bluecoat and internal firewalls.
I remember that one security audit raised a concern regarding security for RFC, the SAP Gateway and the Message Server. My answer was that we have an internal firewall and saprouter protecting the ports of the SAP servers. No direct connection from the user network can reach the Gateway or the Message Server. So why should I protect them?
I am not against security protection: I have implemented SSO with SAP GUI SNC (with encryption) and SPNEGO for the Portal. So when no password is sent through the network, SSL isn't actually needed. For your information, SAPSSO2 shouldn't contain any password.
This article on SSL performance is very good. I only disagree with those who think SSL is a must in all circumstances (the default solution). I believe that in the security world, what really matters is whether we have to or not.
If it's a must, then even if it's slow (or slower), we must implement it, like it or not. But when it's not needed, why enforce it?
That's not end-to-end security. In case the firewall is down or misconfigured, people can access the SAP system. Just because you run an open SMTP relay on a non-standard port does not mean that it's secure. SSL gives you the additional security of knowing that when you send 123, the server also receives 123 and not A.
The MYSAPSSO2 cookie does not contain any password. It contains something worse: an SSO token (not to mention JSESSIONID)
Thank you very much for your detailed analysis of current SSL performance!
You really hit it - there are not many blogs out there getting a 5-star average rating
This blog will definitely help us at SAP Active Global Support to convince our customers to switch on SSL (or SNC, respectively) for every client-to-server connection - there is no excuse for omitting that anymore - and for server-to-server connections if the server network is not closed by other means.
"there is no excuse for omitting that anymore"
When there is no excuse, why is it deactivated by default by SAP? What happened to: we are SAP, we understand security and take it seriously?
thank you for the feedback, and happy to hear things like encouraging customers to switch to more secure setups is something you're actively doing.
good work! Actually, comparing the all-new scenario with plain http is quite interesting when it comes to investigating the effects on the system when running SOAP via https. We did some load testing with SOAPUI last year and got similar variations with larger response messages (>1.5 MB) even though we used plain http.
yes, it's an odd phenomenon but I have no idea what it's caused by. For a system handling hundreds of requests per second it would be good to fix, but for our SAP system and I suspect most others, I don't think it makes a big difference.
Based on your experiences it doesn't seem to be isolated to just SSL traffic though, so maybe it's something with the underlying ICM thread handling or the communication with work processes. Just guessing though... 😐
Maybe you can share your experiences with SOAP web services on SCN some day?
Really nice article and worth reading. Though I am not a security expert, the article really rings bells in my mind about using SSL on the portal.
thank you, and I'm glad you found it informative. I'm not a security expert either, however if you're interested then it's not so difficult to get a decent working knowledge of the concepts of SSL in a few days from reading some blogs and watching a few videos. There are quite a few people at well-known internet firms (Google, twitter, Netflix) who work on, and write about, deploying SSL at extreme scale, so there is no shortage of good information out there!
I cannot find the SAP Note saying the CPU Utilization will increase max 10%. Could you paste me the link?
Thanks in advance