
This is a new weblog, so let me start with a few words about myself and how this log came about.

I joined the SAP performance group, part of Performance, Data Management & Scalability, about a year ago. To begin, I received an exciting (and sometimes frightening) initiation into performance issues, both for the ABAP and Java worlds. In particular, I got introduced to benchmarking situations, where typical multi-user applications are squeezed up to the very limit of the test servers. The principal behavior of service systems under load is supposed to be well understood and mathematically described by Markovian queueing theory. Nevertheless, some curves and figures for benchmarks with Java applications looked peculiar enough to take a closer look. In a small study we tried to shed light on the details of the behavior. Here I want to share some of the findings on effects introduced by Java garbage collection.

 

Expectation

 

Before we dive into Java details, let's have a quick look at the application behavior expected from queueing theory. Here and in the following, I will measure system performance by the average interaction response time, which is exactly what an end user would care about.

The plot shows relative response time over system load. The solid line is the theory prediction for a so-called M/M/4 queue (a typical queueing theory acronym: it indicates that 4 service centers work on satisfying requests, where both the request inter-arrival times and the request service times follow an exponential, Markovian or memoryless, distribution. At http://www2.uwindsor.ca/~hlynka/queue.html you'll find a good collection of queueing theory resources on the web).
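For reference, the M/M/c curve can be computed from the standard Erlang C formula. Here is a minimal sketch (my own code, not from the original study; note that this open M/M/4 model gives roughly 13% at 55% load, close to the ~10% quoted below for the closed benchmark setup with finite users and think times):

```python
import math

def erlang_c(c, a):
    """Erlang C: probability that an arriving request must wait,
    for c servers and offered load a = lambda / mu (in Erlangs)."""
    idle_terms = sum(a**k / math.factorial(k) for k in range(c))
    wait_term = a**c / math.factorial(c) * c / (c - a)
    return wait_term / (idle_terms + wait_term)

def relative_response_time(c, rho):
    """Mean M/M/c response time in units of the raw service time 1/mu,
    at utilization rho = lambda / (c * mu). Valid for rho < 1."""
    a = c * rho                                   # offered load in Erlangs
    mean_wait = erlang_c(c, a) / (c * (1 - rho))  # queueing delay / (1/mu)
    return 1.0 + mean_wait

print(relative_response_time(4, 0.55))  # -> ~1.13, a mild increase
```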

 

[image: relative response time vs. system load, M/M/4 theory prediction (solid line) and FI benchmark measurements (symbols)]

 

We can immediately compare this prediction to the ABAP world. The symbols represent measurements for the FI benchmark (http://www.sap.com/benchmark): we see good agreement with theory. So far, so good. But now let us turn to an example of a Java application.

The Java situation

I used an example application with specific properties. I will give details on the relevant parameters (raw service time, user think time, garbage collection duration and frequency) later on. For now, let's just say that in this example the parameters are set to clearly show the behavior that popped up in the benchmark runs mentioned above.

Already at moderate system load, the average response time is much higher than expected. In particular, we see a factor-of-two increase at about 55% system load. Compared to the 10% expected from theory, this cries out for an explanation. Now we have arrived at the starting point of our study: we will see that the effect is predominantly produced by Java garbage collection.

[image: relative response time vs. system load for the Java example, well above the M/M/4 prediction]

Garbage collection (GC)

Memory management in Java relies on garbage collection (see http://www-106.ibm.com/developerworks/java/library/j-jtp10283/). Simplified, a garbage collector run stops all application threads on the same virtual machine (VM) to remove discarded objects from heap memory.

(To keep the discussion clear, we assume the use of only one VM in the system and a simple 'stop the world' garbage collector. For the example measurements, we used a Sun VM (http://java.sun.com/docs/hotspot) with default garbage collection, so we had a generational garbage collector with minor and full GC runs. The overall heap size was 1 GB; at 57% system load we spent 2% of the run time in full garbage collection and 4% in minor GC. The average full GC duration was 5.4 seconds.)

It is obvious that the thread stop caused by a GC run has an impact on both the system utilization and the application response times. In our example: during GC, only one of the 4 processors of the test machine was used, and all active requests had to wait. Let's see how big this stop-time effect is for the response time. (Information on the impact of GC stop times on system utilization, and much more, is available at http://java.sun.com/docs/hotspot/gc1.4.2 )

GC stop times

The impact of the stop times can be calculated analytically if we assume that single service requests are evenly distributed over time. We need only a few parameters for the application and system: the raw request service time, the user think time, the GC duration, and the GC frequency. I'll leave out the calculation details and just plot the result for our example (raw request service time of 0.25 seconds and average user think time of 9.75 seconds).

[image: calculated response time increase from GC stop times alone]

Indeed, we see a response time increase due to the stop times. But it is rather weak and far from explaining the increase in the Java example.
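How weak? A crude back-of-the-envelope model (my own simplification, not the analytic calculation referred to above) already gives the right order of magnitude: a pause that starts while a request is in service adds the full pause length, and a request arriving during a pause waits, on average, for half of it.

```python
def gc_stop_time_penalty(service_time, gc_duration, gc_frequency):
    """Rough mean extra delay per request [s] from stop-the-world
    pauses of length gc_duration [s], gc_frequency times per second."""
    hit_in_service = gc_frequency * service_time   # pause starts mid-request
    arrive_in_pause = gc_frequency * gc_duration   # request arrives during a pause
    return hit_in_service * gc_duration + arrive_in_pause * gc_duration / 2

# Full-GC numbers from the example: 2% of run time spent in 5.4 s
# pauses implies roughly one full GC every 270 s:
extra = gc_stop_time_penalty(0.25, 5.4, 0.02 / 5.4)
print(extra, extra / 0.25)  # ~0.059 s extra, i.e. ~24% on a 0.25 s request
```

This crude estimate probably overstates the exact analytic result, but even so it falls far short of the observed factor of two.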

We have to look for more. And, actually, there is more to GC than just the stop times. For example, there is a dynamic effect which I'll call 'request bunching'.

GC-induced request bunching

For the calculation of the stop-time effect, I used the innocent assumption that 'single service requests are evenly distributed over time'. One could reformulate this as 'at any time, we find the same number of requests ready to be serviced'. Sounds boring enough, but it can be drastically changed by GC, as illustrated in the following sketch. We start with a constant number of pending requests per time, corresponding to a certain moderate system load. A GC run comes along. During GC, all requests wait, and new requests are simply added to the waiting list. So, when GC is finished, we end up with a much larger number of pending requests, which corresponds to a much higher system load.

Finally, this request bump is worked down. How do we get back to 'normal'? Well, that depends on application details. It is even possible to never get back, but to end up in eternal load oscillations, a nice topic for future logs.
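The size of the bump is easy to estimate with the example's numbers (a sketch under simplifying assumptions of mine: steady arrivals, full GC only, and a drain that runs at full service capacity):

```python
def request_bump(throughput, gc_duration, servers, service_time):
    """Backlog built up during a stop-the-world pause, and the time
    needed to work it down using the spare service capacity."""
    backlog = throughput * gc_duration            # requests queued during pause
    spare = servers / service_time - throughput   # leftover capacity [req/s]
    return backlog, backlog / spare

# ~57% load on 4 CPUs with 0.25 s raw service time -> ~9.1 requests/s:
backlog, drain = request_bump(0.57 * 4 / 0.25, 5.4, 4, 0.25)
print(backlog, drain)  # ~49 pending requests, ~7 s of elevated load
```

So a single 5.4-second pause leaves roughly 49 requests pending against only 4 service centers, and even at full tilt the system needs about 7 more seconds to return to normal, during which response times stay high.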

[image: sketch of pending requests over time, a bump builds up during the GC run and is worked down afterwards]

In addition to the pure GC stop time, we now have a high-load phase and a recovery period, which both affect response times. To get numbers, I have modeled the Java example in a Monte Carlo simulation. The result is shown in the last plot. The simulation result (corrected by a measured service time increase) is in good agreement with the data. So the garbage collector does indeed seem to be responsible for the behavior of our example application!

[image: Monte Carlo simulation result compared to the measured Java example data]
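The original simulation code is not included here; the following is a much-simplified reconstruction of such a Monte Carlo model (my own sketch, with assumed simplifications: deterministic full-GC pauses every 270 s, no minor GC, no service-time increase correction, FCFS dispatch):

```python
import heapq
import random

def finish_time(start, work, period, dur):
    """Wall-clock completion time for `work` seconds of CPU service
    starting at `start`, with service frozen during the stop-the-world
    pauses [k*period, k*period + dur), k = 1, 2, ...  Assumes dur < period."""
    t = start
    while True:
        k = int(t // period)
        if k >= 1 and t < k * period + dur:  # currently inside a pause
            t = k * period + dur
            continue
        next_pause = (k + 1) * period
        if t + work <= next_pause:
            return t + work
        work -= next_pause - t               # service until the pause starts...
        t = next_pause + dur                 # ...then wait the pause out

def simulate(n_users, think=9.75, service=0.25, servers=4,
             gc_period=270.0, gc_duration=5.4, horizon=50_000.0, seed=1):
    """Closed-queue Monte Carlo: n_users cycle between exponential think
    and service times on `servers` CPUs (FCFS); a stop-the-world pause
    of gc_duration freezes all service every gc_period seconds.
    Returns the mean response time (queueing wait + service)."""
    rng = random.Random(seed)
    events = []                              # (time, kind, arrival_time)
    for _ in range(n_users):                 # kind 0 = arrival, 1 = completion
        heapq.heappush(events, (rng.expovariate(1 / think), 0, 0.0))
    free, queue = servers, []
    total, count = 0.0, 0
    while events:
        t, kind, arrived = heapq.heappop(events)
        if t > horizon:
            break
        if kind == 0:                        # a request arrives
            if free:
                free -= 1
                done = finish_time(t, rng.expovariate(1 / service),
                                   gc_period, gc_duration)
                heapq.heappush(events, (done, 1, t))
            else:
                queue.append(t)
        else:                                # a request completes
            total += t - arrived
            count += 1
            heapq.heappush(events, (t + rng.expovariate(1 / think), 0, 0.0))
            if queue:                        # hand the server to the next in line
                done = finish_time(t, rng.expovariate(1 / service),
                                   gc_period, gc_duration)
                heapq.heappush(events, (done, 1, queue.pop(0)))
            else:
                free += 1
    return total / count

# ~91 users gives roughly 55-57% load (91 * 0.25 / (10 * 4)):
no_gc = simulate(91, gc_duration=0.0)
with_gc = simulate(91)
print(no_gc, with_gc)
```

With these parameters, the run with pauses shows a clearly higher mean response time than the pause-free run, qualitatively reproducing the plots; exact numbers depend on the seed and on the details this sketch omits.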

What's next?

We have seen that Java garbage collection can have a significant impact on response times. We think we understand that this is caused by the combined effect of stop times and request bunching. But so far, that is just an interesting observation. It is time to look at consequences:

– What does this mean for Java development?

– What about optimizing the virtual machine, heap size, garbage collection?

– Nice example, but what is the effect for my application? Is this relevant at all?

– What about other serializations/contentions?

With your help, I'll address these questions in the future. Certainly this is not the first time anyone has stumbled over these issues, so I'm looking forward to learning about your experiences!

Also, there will be more on the analytic calculations, the Monte Carlo simulation, and other topics which got short-changed. And I promise to keep it briefer next time!


10 Comments


  1. Mark Finnern
    Hi Rudolf,
    This is an excellent first post, with an introduction, interesting content, links to other resources, and an outlook on what is coming.
    Can’t wait. Thanks, Mark.
  2. Hey,

    A very interesting weblog. Can you tell which version of the JVM was used in your tests?


    Marcin

  3. Former Member
    Hi

    It's a nice post as a starter, but I'm missing some important information. JVM 1.4 provides 6 GC strategies, so you should try all of them. For your even distribution of requests, it would be much better to use an evenly distributed GC, not the default big-bang one.

    Java 1.3 is rather history, so I'm not sure if it is worth testing anything there. Java 1.4 has a nice graphical tool for watching your GC, so you can see what's going on there and decide which strategy to use.

    Best regards,
    Ivan

    1. Former Member Post author
      Hi Ivan,

      Thank you for starting this important discussion! I'm not quite sure which important information is missing. Is it the GC strategy that was chosen in the example? I mentioned that we used the default strategy, leading to 4% of the time spent in minor and 2% spent in full GCs. This was chosen as a starting point which features all the ingredients of garbage collection. The main motivation so far was to understand the mechanisms of how GC makes the application behave the way it does.

      Discussing consequences, like optimizing the GC strategy, is one of the stated aims of this blog!

      And here you are perfectly right: there exist various options for JVM 1.4 besides the default strategy, like the Parallel (Throughput), Concurrent, or Incremental GC. In all cases, however, full garbage collection cannot be completely avoided. So before going into optimization, I think it is important to understand what effects are produced by full GC. That was the goal of this first post.

      Of course, avoiding GC and minimizing GC duration is the basic winning strategy in this game. Optimizing VM parameters (GC, heap size) is one ingredient, memory-conscious programming another. Saying that was the easy part. How to do it for applications in the real world is more difficult, and I do not claim to have ready-to-cook recipes. Therefore, I would very much appreciate it if you could share your experience and go into a bit more detail on the GC strategy / optimization you have in mind!

      Best, Rudolf

  4. First of all, nice blog.

    But the serious question is: how and what can we (2000+ SAP application developers on Java) do to minimize the effects of garbage collection on performance? I am interested more in the architectural perspective than in the detailed coding perspective (like using StringBuffer instead of String for concatenation, initializing variables outside loops, etc.).

    With the NetWeaver platform being composed of many bricks like J2EE, DB drivers, KM, portal, etc., it is difficult to pinpoint where and how the improvements could be made.

    1. Former Member Post author
      Hi Kiran,

      I agree that the garbage collection effects on performance should be optimized! But what this means still depends a lot on the application in question, so I will not attempt a universal answer to a global question. To put things into perspective, let me emphasize that the behaviour described in the weblog is seen for the example application with its specific properties. In the follow-up, I plan to go into detail on the dependence of the GC effects on GC parameters and application properties. For now, some more general remarks:

      – I think it is important to be aware of the GC effects as a Java developer.
      – Then the 'detailed coding perspective' is an immediate answer. A lot can be done by memory-conscious programming.
      – From the architectural perspective, one should keep in mind that memory and CPU are not independent resources. For example, this affects cache design: the benefit from the cache must outweigh the cost paid to the garbage collector.

      Best, Rudolf

  5. Hi Rudolf,

    Firstly, for a blog all about garbage, it still makes great reading!

    In your future posts could you consider the following:
    – Include all the JVM switches you used for the test.

    – It was recommended in our go-live checks from SAP that -XX:+UseParNewGC be used; however, reading various documentation, it would seem that -XX:+UseParallelGC could be better for typical production systems (multiprocessor + large heap). Could you investigate these in a multi-JVM environment? The reason I am asking is that you stated that a single JVM was used on a multiprocessor machine in order to keep things simple in the initial test. However, production servers are going to have more than one server node running, each with its own JVM. Also, since the dispatcher and server each use their own JVM, there will always be at least 2 JVM processes running in a WAS scenario…

    Therefore I hope to see a comparison of single and multiple server node scenarios, if feasible.
    Cheers, Dave.

    1. Former Member Post author
      Hi Dave,

      Using more than one VM is indeed a valid strategy for real situations, as in that case not all threads are hit by the garbage collection stop simultaneously. Also important is the optimization of garbage collection itself. For the Sun VM in particular, there is a variety of strategies with a number of tunable parameters. In addition, the overall heap size is open for optimization. So tuning the VM is a multi-parameter problem, even if you don't factor in specific application behaviour.

      Therefore, before going into the VM optimization game, I focused on understanding garbage collection in a simple setting. This allows getting insights into the mechanism of the GC effects from straightforward simulation. The hope is that understanding these effects will make it easier to choose the best optimization strategy for a given application.

      As for my plans for the future: next, I want to go into a bit more detail on the dependence of the effects on GC and application parameters. Then I plan to turn to optimization, both from the developer's perspective and for VM tuning. How detailed I'll be able to get with multi-node scenarios and specific tuning settings, I cannot tell yet!

      Best, Rudolf

