Analyzing performance problems on a production system, or how to profile without a profiler
Some performance problems do not show up until your system is in production, even if you
do your best to avoid this situation. When it happens, your options are
limited. You cannot just install a profiler on the server, because that
slows down your system so much that it would become unusable. The situation is
even worse when the problem only occurs infrequently.
So what can you do today (I can promise you
that the SAP VM will improve the situation pretty soon) to figure out
which code causes your problems?
You can use thread dumps.
Getting thread dumps
As you learned in the blog “The amazing new heap dump feature in JDK 1.4.2_12”,
there’s a way to trigger thread dumps from the MMC. There’s another way if
you want to automate this and get more thread dumps. On Windows you can use
the sapntkill command to send the QUIT signal to your jlaunch process.
Analyzing thread dumps
With a set of thread dumps you can do a simple, not very
accurate, but still surprisingly helpful profile of your application.
A simple way to analyze all the data you
got with the thread dumps is to use standard Unix tools like
“grep” and “sort” to get a
condensed overview of what is happening on your application server.
If you say “but I’m on Windows”: OK,
that’s your fault ;-). No, just go to http://www.cygwin.com/
and install the Cygwin
tools. For the following examples you need only
“grep”, “uniq”, and “sort”.
Try the following command:
grep "^\s*at " std_server0.out | sort -k2 -r | uniq -c | sort -k1 -n -r | more
It will output something like this:
3048 at java.lang.Object.wait(Native Method)
2526 at java.lang.Object.wait(Object.java:429)
1255 at java.lang.Thread.run(Thread.java:534)
975 at com.sapportals.wcm.util.events.EventSenderThread.run(EventSenderThread.java:75)
975 at com.sapportals.wcm.util.events.EventQueue.dequeue(EventQueue.java:68)
533 at com.sap.engine.lib.util.WaitQueue.dequeue(WaitQueue.java:238)
530 at java.security.AccessController.doPrivileged(Native Method)
516 at javax.servlet.http.HttpServlet.service(HttpServlet.java:853)
493 at EDU.oswego.cs.dl.util.concurrent.SynchronousChannel.take(SynchronousChannel.java:209)
493 at EDU.oswego.cs.dl.util.concurrent.PooledExecutor.getTask(PooledExecutor.java:707)
493 at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(PooledExecutor.java:727)
485 at com.sap.engine.core.thread.impl5.SingleThread.run(SingleThread.java:127)
410 at java.lang.Thread.sleep(Native Method)
405 at com.sapportals.portal.pcd.gl.PcdProxyContext.basicContextLookup(PcdProxyContext.java:1101)
What you can see are the source code lines
that appeared most often in your thread dumps. You can interpret the
number in the first column as an indicator of the elapsed time that was
spent in the source code line shown in the second column.
Of course some of these entries are not
interesting because the code is just waiting for something. But with
a little experience you can quickly figure out where the problem is. In this case, the last
line indicates that a lot of time is spent in the PCD. You can then go
back to your std_server file and check from where this code was called.
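
The two steps just described, ignoring frames that only wait and looking up who called a suspicious frame, could be sketched roughly like this. The function names hot_frames and callers_of and the list of "idle" frames are assumptions made for illustration, and the -A option is a GNU/BSD grep extension, so treat this as a starting point rather than a finished tool.

```sh
#!/bin/sh
# Hypothetical helpers (names and idle-frame list are assumptions).

# hot_frames FILE
# Count stack frames per source line, most frequent first, but skip
# frames that are only waiting or sleeping.
hot_frames() {
    grep '^[[:space:]]*at ' "$1" \
        | grep -v -e 'java\.lang\.Object\.wait' -e 'java\.lang\.Thread\.sleep' \
        | sed 's/^[[:space:]]*//' \
        | sort | uniq -c | sort -k1,1nr
}

# callers_of FILE FRAME [DEPTH]
# Print every occurrence of FRAME (matched as a fixed string) together
# with the DEPTH lines that follow it in the dump (default 10).
# In a stack trace those following lines are the callers.
callers_of() {
    grep -F -A "${3:-10}" -e "$2" "$1"
}
```

For the output shown above you might run something like `callers_of std_server0.out PcdProxyContext.basicContextLookup 15 | more`. Note that hot_frames only drops the waiting frames themselves, not the rest of the waiting thread's stack, so the counts remain a rough indicator.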
Instead of using Unix tools you can also
write more sophisticated scripts in Perl (if you want no one else
to understand it 😉), in Python or Ruby or any other programming language.
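
As one example of a slightly more sophisticated script, the sketch below counts only the topmost frame of each thread, which is the line the thread was actually executing when the dump was taken. It assumes that every thread in the dump starts with a header line beginning with a double quote (as in Sun JVM thread dumps) and that the frames follow as "at ..." lines. The function name top_frames is made up for this example.

```sh
#!/bin/sh
# top_frames FILE  (hypothetical helper name)
# Count how often each source line was the TOP frame of a thread.
# Assumption: thread headers start with a double quote, frames with "at ".
top_frames() {
    awk '
        /^"/ { want = 1; next }          # thread header: next frame is the top one
        want && /^[[:space:]]*at / {
            sub(/^[[:space:]]*at /, "")  # keep only the method and source position
            count[$0]++
            want = 0                     # ignore the rest of this stack
        }
        END { for (frame in count) print count[frame], frame }
    ' "$1" | sort -k1,1nr
}
```

You could call it as `top_frames std_server0.out | more`. Compared with counting every frame, this no longer tells you which callers are on the stack, so the two views complement each other.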
By the way, another nice tool to visualize thread dumps
In the next blog I will show you what other types of performance problems
can be analyzed by using thread dumps.