Load Testing – No excuses just many compromises!
So you’ve decided to roll out SAP ‘xyz’ solution, and it’s going to add what you suspect will be a decent amount of load, requiring you to purchase and tune hardware and software to support it. But do we have time to load test the solution?
Are you kidding me?
This is like running a functional test for the first time with one user and saying – yes – that all works – put it in production (which also happens from time to time unfortunately).
The remainder of this blog is to help those who need to think about doing some level of load testing without calling in the load test professionals, i.e. no excuses, just compromises.
Commercially Focused Load Testing
So what does it take to do load testing? Well, it depends… Not an “it depends, so I won’t give you an answer”, but an “it depends” because load testing can be make or break for some projects. For example, if you are rolling out a customer website on the scale of Apple, Amazon or Google, then you are going to have to do a lot of work to load test properly. And while I can give you philosophical answers on what to do, in that case get an expert in load testing (but keep on reading anyway if you like)!
So what else does it depend on? Well, obviously, how much load is expected (there’s no use load testing if it’s 2-3 concurrent users with a few background jobs; finding an issue there would just be called testing it properly). Then there’s visibility, risk to the business (or alternatively efficiency gains if you are testing for quick response times), how quickly you can react to issues if something does go wrong, etc.
So if you think load testing would be worthwhile, your overall risk profile isn’t too great, and you find yourself debating whether to do load testing because of the cost, then I say – let’s do it, but compromise.
Things you’ll need:
- A production-like system with as near to final software as possible (sized similarly, with batch jobs and configuration as close to production as possible). Plus a good, proven backup of this system before you run load tests!
- A good idea of the # of different types of users that will be hitting the system concurrently – and in extreme cases a more detailed hour by hour breakdown of expected load (e.g. Payroll, BI and Bill Run loads running concurrently at 1am can be fun)
- One or more test machines located near the servers, but which cover the end-to-end systems being performance tested (e.g. if you are going via a reverse proxy and you are concerned about where the bottleneck may be, then you’ll need to test beyond the reverse proxy). Go with 2000 real users per test machine as a guide
- A good set of real scenarios that cover off the various components and technologies, and avoid caching and locking. e.g. Managers looking at the Universal Worklist, Employees raising a trip request, displaying a BI report, searching on a document, posting a document, etc. Note – Avoid large downloads like files as this will limit your ability to emulate load since it effectively soaks up the network on the test machines.*
- A download of jMeter (jmeter.apache.org) with the additional plug ins (https://code.google.com/p/jmeter-plugins/) to give us some fancy graphs and emulators. Note – If you can afford HP LoadRunner, then it’s easier to learn (I always go to type Lode Runner for those raised in the 80’s)
- A coder/tester who understands how HTML works (I wish this was any coder nowadays) and how to use developer tools within browsers (ditto)!
- Basis and infrastructure volunteers to monitor your load testing. And just in case, you may need a couple of CABs approved for comms, as you just never know what will happen to the business!
* My main point here is that emulating real production load is futile in most cases, so sure – do some specific performance testing of big custom builds, or get a bunch of users hitting the system at once for application concurrency issues, but let’s just focus load testing on ensuring we have the right infrastructure, basis and application parameters set-up to give your stakeholders confidence that if anything goes wrong with load and go-live – that it’s likely to be a specific application issue.
So with all this, you can complete the following:
- A Load Testing Plan (to keep stakeholders happy) – but this is a quick document describing the things we’ll need – and should be fit for purpose – I had SAP offering to review my load testing document over 5 days for something that took a day to write!
- Get your developer going through some training material to learn jMeter!
The key is to break down the different user types, and assign them relevant pauses between “clicks” (according to an old SAP Note 651581, SAP states this is typically ~7 seconds for “well experienced” users and 20 seconds for less experienced ones – not sure they have seen my mother in law in front of an iPad but hey). Then potentially a different wait time between end-to-end transactions. Put this into your test plan, and now we’re good to go.
I’m a big fan of jMeter. It’s got a lot of power and is mostly intuitive, though maybe not quite as user friendly as LoadRunner. There’s quite a bit already out there in SCN about jMeter, but I’ll just add some cool things I noticed and give some lessons from the SAP based testing I did. Topics I’ll cover:
- Threadgroups with jMeter Plug-Ins and Documentation
- Different User Threadgroups
- Arghh – Web Dynpro Dynamic Parameters
- Recording Scripts Through Built-In Proxy
- Random Timer
- NWBC and SAPGUI
- Load Versus Stress Testing
- HTML Request/Response Parameters to Care About
- What Happened at “n” Minutes???
- Search and Replace
- Chrome Tools and StackEdit
- Ah – So Pretty!
Threadgroups with jMeter Plug-Ins and Documentation
I use threadgroups as a way of grouping the different types of users. Using the jMeter plug-ins, you can use a stepping threadgroup, which makes it easy to progressively increase the # of users to the maximum number over a period of time, maintain that number a little longer (to soak the system a little), then gradually roll the individual threads off. It also gives you a graph you can cut and paste into your test plan.
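To make the shape of such a stepped profile concrete, here is a minimal sketch in Python that computes the (time, active users) points of a ramp-up, hold and ramp-down. It is not how the plug-in is implemented; the function name and the numbers (200 users, steps of 50 every 60 seconds, a 300 second soak) are illustrative assumptions only.

```python
def stepping_schedule(max_users, step_users, step_seconds, hold_seconds):
    """Return (time_in_seconds, active_users) points for a stepped load profile.

    Hypothetical helper for planning purposes only; it does not talk to jMeter.
    """
    if step_users <= 0 or step_seconds <= 0:
        raise ValueError("step_users and step_seconds must be positive")

    points = []
    t = 0
    users = 0

    # Ramp up: add step_users every step_seconds until the maximum is reached.
    while users < max_users:
        users = min(users + step_users, max_users)
        points.append((t, users))
        if users < max_users:
            t += step_seconds

    # Hold the peak for a while to soak the system.
    t += hold_seconds

    # Ramp down: remove step_users every step_seconds until nobody is left.
    while users > 0:
        users = max(users - step_users, 0)
        points.append((t, users))
        t += step_seconds

    return points


if __name__ == "__main__":
    for seconds, users in stepping_schedule(200, 50, 60, 300):
        print(f"t={seconds:>4}s  active users={users}")
```

Writing the schedule down like this is mostly useful for agreeing the ramp with stakeholders before you configure the real threadgroup.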
Different User Threadgroups
As I hinted in the previous point, threadgroups are great for separating the user groups, which also makes it easy to split reporting by group.
If you capture all the metrics to a file, you can analyse the results any time. You can try to display a pretty graph while running the scripts to monitor the progress (e.g. Response times) but if you were running for an extended time or looking at a substantially large test scenario, you’ll probably chew up all memory in jMeter before you hit performance issues.
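As a rough illustration of analysing the results file afterwards, the sketch below averages response times per sample label per minute. It assumes the results were saved as CSV with a header row containing `timeStamp` (epoch milliseconds), `elapsed` (milliseconds) and `label` columns; check your own save settings, as the function name and sample rows here are made up.

```python
import csv
import io
from collections import defaultdict


def mean_elapsed_per_minute(csv_file):
    """Average the 'elapsed' column per (label, minute) from a results CSV.

    Assumes a header row with 'timeStamp', 'elapsed' and 'label' columns.
    """
    totals = defaultdict(lambda: [0, 0])  # (label, minute) -> [sum_ms, count]
    for row in csv.DictReader(csv_file):
        minute = int(row["timeStamp"]) // 60000  # epoch ms -> minute bucket
        bucket = totals[(row["label"], minute)]
        bucket[0] += int(row["elapsed"])
        bucket[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


if __name__ == "__main__":
    # Made-up sample rows standing in for a real results file.
    sample = io.StringIO(
        "timeStamp,elapsed,label,success\n"
        "60000,200,NWBC login,true\n"
        "61000,400,NWBC login,true\n"
        "120000,900,NWBC login,true\n"
    )
    for (label, minute), avg_ms in sorted(mean_elapsed_per_minute(sample).items()):
        print(label, minute, avg_ms)
```

Because this reads the file row by row, it avoids holding a live graph in memory during the run.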
Arghh – Web Dynpro Dynamic Parameters
This is the ugliest part of the work I did. I discovered that Web Dynpro (ABAP) is really bizarre when it comes to writing HTML scripts. In short, unless you do something about it, Web Dynpro parameters are dynamic and incredibly difficult to program for. The solution? A very well hidden parameter (but discoverable in Service Marketplace if you look for automated testing) called sap-wd-stableids. Luckily, thanks to the community, I found this which helped describe it a little further:
This all said, for some scenarios, I still couldn’t take a script I recorded in one system and use it in another system without re-recording the script. I did try and change all references (refer to search and replace below), but for some Web Dynpro screens, there was more to it than just this parameter id. As Abdul states in his reference above; it’s not guaranteed to work in all cases.
Recording Scripts Through Built-In Proxy
Okay – This is a bit 101 for jMeter users, but you can set up a proxy in jMeter, point your browser at it, and run through the required scenarios. This effectively builds you a shell of a script to start adapting for your testing, so you don’t need to hand build these very low level scripts. One lesson learned – put a filter on the recorder, as you don’t need to download jpg’s, CSS files, etc within load testing.
Random Timer
Obvious, but just to be clear, use the Gaussian Random Timer between “clicks” (not between individual HTTP requests) to provide a random think time. Users responding every 7 seconds exactly isn’t realistic. I would suggest a random delay between 2 and 19 seconds would be more realistic – hence this timer should be used.
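To show roughly what such a timer does, here is a small Python sketch. The jMeter documentation describes the Gaussian Random Timer delay as a constant offset plus a Gaussian value (mean 0, standard deviation 1) multiplied by a deviation you specify. The offset of 10.5 seconds, the deviation of 3 seconds, and the clamping to the 2 to 19 second range are assumptions added here for illustration; the clamp in particular is not something the timer itself does.

```python
import random


def think_time_seconds(offset=10.5, deviation=3.0, low=2.0, high=19.0, rng=random):
    """Return a Gaussian-distributed pause, clamped to [low, high] seconds.

    offset, deviation, low and high are illustrative values, not recommendations.
    """
    delay = offset + rng.gauss(0.0, 1.0) * deviation
    # Clamp so that rare extreme draws stay inside the intended range.
    return min(max(delay, low), high)


if __name__ == "__main__":
    random.seed(42)  # fixed seed so repeated runs print the same pauses
    print([round(think_time_seconds(), 1) for _ in range(5)])
```

With these values most pauses land between roughly 7 and 14 seconds, which is what you want from "most users are about this fast, a few are much faster or slower".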
SSL
Simple lesson here – record without SSL, and switch to SSL for the load test. And if you are afraid to use SSL, refer to this blog:
NWBC for Desktop and SAPGUI
Yep – These are thick clients and won’t work with jMeter. So what do we do? We compromise. Use NWBC for HTML and WebGUI/ITS. It will add more load to SAP than expected but that’s fine – it’s load testing and our tests are really not going for perfection.
You won’t get access to write your load tests in every system, so building your scripts so that they can be moved between systems is another obvious step. Two of the easier ways to do this:
1. Using user defined variables in the Test Plan (e.g. Time between clicks, Time between transactions) and;
2. Using the config element HTTP Request Defaults, which allows you to enter details such as server name, port number, HTTP/HTTPS, etc; and have them applied to all requests (called samples) that have blank values for this information.
Load versus Stress Testing
Load test – Testing of expected volume in production.
Stress test – Testing beyond expected volume to the point of failure where failure could simply be an unacceptable response time.
For my scenario, I had a limited number of users in the SAP system, so rather than creating thousands of users, I changed wait times between clicks and transactions to much less – which effectively quadrupled the overall load with a simple tweak of a couple of parameters. It does not give you a perfect idea of how many users you can handle, but it does let you identify bottlenecks in your system, i.e. I compromised to get the job done.
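The arithmetic behind this trick can be sketched with a simple closed-loop model: each virtual user sends a request, waits for the response, pauses, and repeats, so the request rate is roughly users divided by (response time + wait time). The numbers below (500 users, 1 second responses, wait time cut from 7 seconds to 1 second) are illustrative assumptions, not figures from the test described here, and the model ignores the fact that response times usually grow as load increases.

```python
def requests_per_second(users, response_seconds, think_seconds):
    """Approximate request rate for a closed loop of virtual users."""
    return users / (response_seconds + think_seconds)


if __name__ == "__main__":
    baseline = requests_per_second(500, 1.0, 7.0)  # normal think time
    stressed = requests_per_second(500, 1.0, 1.0)  # shortened think time
    print(f"baseline: {baseline} req/s")
    print(f"stressed: {stressed} req/s")
    print(f"load multiplier: {stressed / baseline}")
```

Same number of users, same scripts, but the shorter pause multiplies the request rate, which is why it is a cheap way to push towards a stress test.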
HTML Request/Response Parameters to Care About
When scripting SAP web transactions, there are quite a few parameters to care about that you have to retrieve from a response from SAP and insert into the next request. The most important ones just for reference are (I’ve also included very technical jMeter information which you can ignore unless you need to do this yourself and want some guidance):
- XSRF for NWBC: regex used name="sap-login-XSRF" value="(.+?)"
- sap-nwbc-context for NWBC: regex used name='sap-nwbc-context' value='(.+?)'
- sap-ext-sid for NWBC: regex used sap\\x2dext\\x2dsid\\x3d(.+?)\\x3f followed by a BSF PostProcessor that uses the following: var test = vars.get("sap-ext-sid");vars.put("sap-ext-sid",test.replace("\\x2d", "-"));
- sap-wd-secure-id for Web Dynpro: regex used name="sap-wd-secure-id" value="(.+?)"
- sap-contextid for Web Dynpro: regex used sap-contextid=(.+?NEW)
Note – To work out where this happens, you use the “View Results Tree” listener, which captures the requests/responses when you run the test so you can reverse engineer and debug the parameter mappings.
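If you want to try patterns like these outside jMeter before wiring them into extractors, a few lines of Python will do. This is only a sketch: the response body used in the usage example is invented for illustration and is not a real SAP response, and the helper names are hypothetical.

```python
import re

# Patterns taken from the list above (straight quotes, as they appear in the HTML).
PATTERNS = {
    "sap-login-XSRF": r'name="sap-login-XSRF" value="(.+?)"',
    "sap-wd-secure-id": r'name="sap-wd-secure-id" value="(.+?)"',
    "sap-contextid": r"sap-contextid=(.+?NEW)",
}


def extract_tokens(body):
    """Return the first match of each known pattern found in a response body."""
    found = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, body)
        if match:
            found[name] = match.group(1)
    return found


def decode_hex_escapes(value):
    r"""Turn literal \xNN sequences (e.g. \x2d) into the characters they encode."""
    return re.sub(r"\\x([0-9a-fA-F]{2})", lambda m: chr(int(m.group(1), 16)), value)


if __name__ == "__main__":
    # Invented response fragment, for illustration only.
    body = (
        '<input type="hidden" name="sap-wd-secure-id" value="ABC123">'
        '<a href="/app?sap-contextid=SID%3aANON%3ahost_NEW">next</a>'
    )
    print(extract_tokens(body))
    print(decode_hex_escapes(r"abc\x2ddef"))
```

The second helper covers the same ground as the post-processing step for sap-ext-sid, where the value arrives with escaped characters that need turning back into plain ones.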
What Happened at “n” Minutes???
So in the middle of my load test, I had a massive spike in response time, and then everything was not going so well. What happened? Well, the point here is not the monitoring you should have in place on the server, but that you should also capture relevant statistics via perfmon.exe on your test machine (or the equivalent if not Windows). Upon investigation, it looks like a software roll-out occurred at this point, impacting my machine, and then the network was impacted by software rolling out to nearby machines.
Monitoring Results in VMware
So a week after I finished testing, I asked the infrastructure guys to send through some graphs. Guess what, after a week VMware averages/sums up the details to a granularity of an hour. I can tell you that reporting a graph with a flat line in load testing is not useful, so grab your graphs while the detail is still there.
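A quick worked example shows why an hourly roll-up flattens exactly the thing you ran the test to see. The numbers are invented (CPU at 20% for 55 minutes with a 5 minute spike at 95%) and this is plain averaging, not a description of how any particular monitoring product stores its data.

```python
from statistics import mean


def rollup(samples, bucket_size):
    """Average consecutive samples into buckets of bucket_size."""
    return [
        mean(samples[i:i + bucket_size])
        for i in range(0, len(samples), bucket_size)
    ]


if __name__ == "__main__":
    # One made-up hour of per-minute CPU readings: quiet, then a 5 minute spike.
    per_minute_cpu = [20.0] * 55 + [95.0] * 5
    print("peak at 1 minute granularity:", max(per_minute_cpu))
    print("5 minute averages:", rollup(per_minute_cpu, 5))
    print("hourly average:", rollup(per_minute_cpu, 60))
```

At one minute granularity the spike is obvious; averaged over the hour it becomes a single unremarkable value.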
Search and Replace
A quick tip about jMeter scripts. They are just XML text files. Hence search and replace in WordPad or similar is the easiest way to change a hostname throughout.
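If you would rather script the change than do it by hand, a plain text replace is enough. The sketch below works on the script contents as a string; the hostnames and the XML fragment in the usage example are made up, and in practice you would read the .jmx file, call the function, and write the result to a new file so the original is kept.

```python
def replace_host(jmx_text, old_host, new_host):
    """Replace every occurrence of old_host in a jMeter script's text."""
    return jmx_text.replace(old_host, new_host)


if __name__ == "__main__":
    # Made-up fragment standing in for the contents of a .jmx file.
    script = (
        '<stringProp name="HTTPSampler.domain">dev.example.com</stringProp>\n'
        '<stringProp name="Argument.value">https://dev.example.com/sap/bc</stringProp>\n'
    )
    print(replace_host(script, "dev.example.com", "qa.example.com"))
```

A blunt replace like this also catches hostnames buried inside URLs and parameter values, which is usually what you want when moving a recorded script between systems.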
Chrome Tools and StackEdit
This was a nice surprise – I used StackEdit (a Chrome app) during this work and discovered that the cryptic responses you get back from Web Dynpro render nicely in StackEdit, so you can see what is happening. To explain, Web Dynpro sends UI commands back to the web page in a very compressed format rather than as complete web pages, and StackEdit makes this visible.
I also made use of some diff tools in Chrome which was handy but not good enough to promote here – still looking for a good one actually.
Ah – So pretty!
Of course, load and stress testing is very technical, and you have stakeholders to please. You need pretty graphs. You’ll obviously get lots of good graphs from Solution Manager or Infrastructure monitoring, but what Stakeholders will most care about is response times for different scenarios. So provided you name the samples well, you can use the Response Times Over Time graph for the selected scenarios you care about (e.g. NWBC login, Refresh Universal Worklist, etc); it makes great graphs for reporting that you can clearly explain.
So what happened for us…
We found we had not changed the default maximum sessions which meant it died at around 200 users the first time we ran the test. We didn’t configure the number of background and update tasks to handle the load expected. We noticed one batch job killing the system (partly due to the previous set-up).
And this was in a system that was meant to be identical to production!
But we applied all the changes into this system and then production and re-ran the tests and all was good in the world (at least for our solution being load tested)! (Compromised) Load and Stress Testing FTW!
"Commercial" load testing can be a very big problem for projects. As the software + load testers can cost a significant amount of money, they are added during QA phase. IMHO this is too late. And there is not excuse for Devs to include load tests in their daily tasks, using jMeter.
Even old computers have enough power to run a jMeter test that simulates 10 to 20 users. That's close to expected real life experience (how many times do you have 20 users simultaneously accessing an app?) and can already show some interesting behavior that the Dev may not have taken into consideration. Finding these errors during early coding stages is priceless.
Finding these later by a professional load tester using a tool like HP Load Runner is too late and costly.
Problem is that companies understand the need for load testing during QA, but during development phase? Dev gets his money to code, not to test ...
I imagine you hit the requirement of load testing a lot. But even in straight ERP implementations, I see system integrators saying there's no need to do load testing; and that's where a late but necessary QA approach to load testing is a good idea.
To a degree, the development architect if they are any good, should be able to identify the risk of load testing and how early you do testing. e.g. Sometimes a real test is not possible because data hasn't been loaded to make the test realistic (e.g. BI loads on an empty system work really fast!). But you raise a good point for custom development to consider running their own mini-load tests during development (but I doubt many would do this). Even personally, I would tend to rely on experience, and understanding my bottlenecks and design around any big issues to minimise risk as no one budgets any time to load test till...you guessed it, QA.
Personally, I integrated load tests into my (local) continuous integration framework. jMeter test case with 20 users, nothing that breaks my laptop and for sure shows bottlenecks or coding errors. If now only every developer would do the same ... should show that most SAPS sizing values are made for bad code 😆
A problem I confront quite often is: jMeter, soapui, ab, selenium, etc tests are not accepted by the professional load testers. 😥 All the work invested in having load tests for nothing, just because the code entered QA. Can I blame software vendor partnerships, especially SAP <-> HP? 😈
I definitely concur with the non-HP tool thoughts from SAP, as I had SAP representatives skeptical I could do what I did, when in reality there is little difference between LoadRunner and jMeter for what I needed it to do (except ease of use initially) and people's perceptions. The great thing about jMeter is you can reuse your (albeit brittle) load tests ongoing, while usually you only purchase a period of time with HP LoadRunner tools.
Awesome Matt! I think you've totally hit the nail on the head here and have touched on quite a few things which I too feel need to receive more attention in the SAP space, namely:
The great being the enemy of the good. If projects realise they need load/perf testing, often the first instinct is to go for formal perf testing by specialists using expensive, separately-licensed tools. Sadly those people are often specialists at running tests and raising defects, but never at actually suggesting or implementing fixes, which means you'd better have your developers on hand too.
Because testers and their setups are complex and expensive, it's always easier to postpone and do perf testing sometime around UAT, i.e. too late. IMHO it's much better to make some compromises and empower your developers to run their own tests using open source (and thus widely-used!) tools early on, so that they can fix issues themselves.
I'm happy that both Tobias Hofmann and even ThoughtWorks through their latest Tech Radar share some of the same sentiments! They point out the value of Guerrilla Testing (i.e. not "the test team" raising defects for "the dev team" to fix), state that projects should treat performance testing as a first-class artefact, and reiterate the value of nimble, focused and free testing tools built by developers for developers over massively overtooled enterprise suites. Case closed then 🙂
Thank you for the blog! Sounds like someone needs to do a Mastering SAP/TechEd presentation on this topic? 😉
P.S. Thanks also for the link!
Thanks Sascha. While I would always do some load testing of anything I have a concern with; I would never want to dedicate a session at Mastering or TechEd unless I could have some fun by trying to bring down a web site or similar on stage 😀 (and focus on the development aspects behind the failure). e.g. I don't want to get the moniker, "Load Testing specialist" and end up with people thinking I have the traits you just mentioned 😛 .
It is nice to see this activity get some attention on SDN. I have been managing SAP performance testing for large SAP installations and have a couple observations to share:
Define the testing - think with the end in mind - at the end of the test you need to be able to explain the results without boring your audience with technical jargon. It should be made clear up front that this testing is not some kind of silver bullet that is going to solve all of your performance issues. Particularly in large systems, there are many other contributing factors to the performance other than the functionality under test.
The results should be communicated to system admin and development teams. They will determine risk factors, and actions that need to be put in a control plan.
I have found that this also dictates the need to ensure that testing is offered as a service to projects; thus, making it part of the tollgates for the architect to report out on. This also allows the ability to plan testing based off of the functionality even being available for testing in the system based off the project schedule. You can't test what doesn't function!
The next point I have is also around scope; If your applications are using SAPGUI, your tests are limited to a very small number of users (60-75) per load generator. Of course this depends on your hardware. The point is, this should be considered when evaluating software purchases for your test tool licenses.
Thanks for the good input Brad - All good lessons learned that should be considered with above (especially working with what actually works for projects). Note - For SAPGUI load tests for smaller places (as I mentioned above) I would just use WebGUI/ITS which is not perfect, but sufficient for providing transaction specific load and doesn't come with additional software requirements/testing system capacity limitations, and can be easily incorporated into jMeter scripts.