
In The wonderful world of java threads (I), the previous part of this blog series, we discussed threads and the tools to analyze them. Now we are going to look at some real situations that will help us understand how to analyze a thread dump. We will work with the Sun and IBM JDKs, so sometimes the thread dumps will be contained in the std_server.out files (Sun), and other times in javacore…txt files (IBM). No matter which JDK we use, the Thread Dump Viewer is able to display both of them.

Ok. Our system is hanging and we have created a series of thread dumps. This is the starting point. Now we can use, for example, Solution Manager Diagnostics or the Thread Dump Viewer to analyze them. As we said in the previous blog, we are going to focus mainly on the Thread Dump Viewer.

Note! For further information on how to use the Thread Dump Viewer (TDV), please check the documentation by Angel Petkov.

 

Example 1:

After loading the thread dump (this time it is a javacore file) we can see the following in the TDV:

Ok, it is an easy one. Although we have only one thread dump (instead of the recommended series), the first thing we see is a popup indicating that there is a deadlock in this thread dump. The deadlock is probably the reason why there are no free application threads (we can see this information in the box in the upper-right corner). When there are no more free application threads on the server node, the server is no longer able to process incoming requests!

If we accept the message and search for threads of type application, we will see that all of them are waiting for a connection to the database and are blocked by Thread-40.

By clicking on Thread-40, we find the reason why the deadlock is occurring:

This deadlock occurs in the OpenCMS implementation on Oracle. The oracleOci thread is in java.sql.DriverManager.getDrivers(), and Thread-40 is in com.opencms.dbpool.CmsPool.getConnection(). A deadlock is always a code problem, so here the OpenCMS developer should check the code and fix the problem.
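
As a side note, a running JVM can report deadlocks by itself through the standard java.lang.management API. The sketch below is only an illustration of that idea and is not how the TDV works. It assumes a Java 6 or newer JVM and that the code runs inside the JVM we want to inspect; the class name DeadlockCheck is made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper class, not part of any SAP tool.
class DeadlockCheck {
    /** Number of threads currently caught in a deadlock (0 if none). */
    static int countDeadlockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads(); // null when no deadlock exists
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads();
        if (ids == null) {
            System.out.println("No deadlock detected");
            return;
        }
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info == null) {
                continue; // the thread ended in the meantime
            }
            // Who waits for which lock, and which thread owns that lock.
            System.out.println(info.getThreadName() + " waits for " + info.getLockName()
                    + " held by " + info.getLockOwnerName());
        }
    }
}
```

On a hanging production system a thread dump is usually still the safer choice, because it does not require any code to run inside the server.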
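
To make the "code problem" a bit more concrete: a deadlock between two monitors typically needs two threads that take the same locks in opposite order. The sketch below is a generic illustration and is not the OpenCMS code. The lock names and methods are made up, and whether the real problem has exactly this shape is an assumption. It shows the usual fix, which is to acquire the locks in one fixed order on every code path.

```java
// Generic illustration of consistent lock ordering (all names are hypothetical).
class OrderedLocks {
    private static final Object POOL_LOCK = new Object();   // always taken first
    private static final Object DRIVER_LOCK = new Object(); // always taken second
    private static int work = 0;

    static void fromPoolSide() {
        synchronized (POOL_LOCK) {
            synchronized (DRIVER_LOCK) {
                work++;
            }
        }
    }

    static void fromDriverSide() {
        // Taking DRIVER_LOCK first here would allow a deadlock with fromPoolSide().
        synchronized (POOL_LOCK) {
            synchronized (DRIVER_LOCK) {
                work++;
            }
        }
    }

    /** Runs both code paths concurrently; true if both threads finished in time. */
    static boolean runBoth(final int iterations) {
        Thread a = new Thread(() -> { for (int i = 0; i < iterations; i++) fromPoolSide(); });
        Thread b = new Thread(() -> { for (int i = 0; i < iterations; i++) fromDriverSide(); });
        a.start();
        b.start();
        try {
            a.join(5000);
            b.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !a.isAlive() && !b.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("Both threads finished: " + runBoth(10000));
    }
}
```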

 

Example 2:

For this example I’ve loaded a series of thread dumps (created with the IBM JDK) into the Thread Dump Viewer. After loading, we can see some interesting things. In the “threads” box, we see that there are 40 application threads, but none of them is free (same case as the previous example, but there is no deadlock here…). In the “filters” box, I have selected “state:CW” (condition wait; this is more or less the “waiting for monitor entry” status for Sun JDK) and thread type: application. This will show only application threads in the CW state. Something like this:

 
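
The filter idea can also be tried out in a live JVM. The sketch below is only a rough analogy and not what the TDV does: it uses the standard Thread.getAllStackTraces() call and selects threads by name prefix and by java.lang.Thread.State. Note that Thread.State values such as WAITING are not the same thing as the javacore state codes, and the thread name used here is made up for the demo.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for illustration only.
class ThreadStateFilter {
    /** Names of live threads whose name starts with the prefix and whose state matches. */
    static List<String> filter(String namePrefix, Thread.State state) {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.getName().startsWith(namePrefix) && t.getState() == state) {
                names.add(t.getName());
            }
        }
        return names;
    }

    /** Starts a daemon thread that parks in Object.wait() and returns once it is WAITING. */
    static Thread startWaiter(String name) {
        final Object lock = new Object();
        Thread t = new Thread(() -> {
            synchronized (lock) {
                try {
                    while (true) {
                        lock.wait(); // nobody notifies; only an interrupt ends the thread
                    }
                } catch (InterruptedException e) {
                    // fall through and let the thread end
                }
            }
        }, name);
        t.setDaemon(true);
        t.start();
        while (t.getState() != Thread.State.WAITING) {
            Thread.yield();
        }
        return t;
    }

    public static void main(String[] args) {
        Thread demo = startWaiter("Application_Thread_demo"); // made-up thread name
        System.out.println(filter("Application_Thread", Thread.State.WAITING));
        demo.interrupt();
    }
}
```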

Ok, there are quite a lot of application threads waiting for something here, and there are no free threads… In fact, most of them are Java Connector (JCo) threads. We could try to increase the maximum number of application threads (as it seems that the current number is not enough). To do that, we have to use the Configtool:

The parameters we need to adjust are MaxThreadCount, MinThreadCount and InitialThreadCount. For a Portal, SAP suggests 150 for MaxThreadCount. For an XI system, this value should be 350 or above. Of course, these values can be changed if needed.

Sometimes this is enough for the system to keep on working properly… But only if there is no deeper problem that makes the system run out of application threads again and again. In this case, as all the application threads seem to be involved in JCo calls, we would need to check how many JCo servers are configured and make sure their number stays below the maximum number of application threads. That is, the number of all JCo servers to all destinations should be less than the number of application threads.
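
The check itself is simple arithmetic: add up the JCo servers of all destinations and compare the sum with MaxThreadCount. A minimal sketch follows. The destination names and counts are invented for the example and the class is hypothetical; the real values have to be read from the system configuration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class; all numbers in main() are made up.
class JcoThreadBudget {
    /** Sum of the JCo servers configured over all destinations. */
    static int totalJcoServers(Map<String, Integer> serversPerDestination) {
        int total = 0;
        for (int count : serversPerDestination.values()) {
            total += count;
        }
        return total;
    }

    /** True if the JCo servers leave room in the application thread pool. */
    static boolean fitsThreadPool(Map<String, Integer> serversPerDestination, int maxThreadCount) {
        return totalJcoServers(serversPerDestination) < maxThreadCount;
    }

    public static void main(String[] args) {
        Map<String, Integer> servers = new LinkedHashMap<>();
        servers.put("DEST_A", 20); // invented destination and count
        servers.put("DEST_B", 25); // invented destination and count

        System.out.println(fitsThreadPool(servers, 40));  // prints false (45 is not below 40)
        System.out.println(fitsThreadPool(servers, 150)); // prints true
    }
}
```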

 

Ok, that’s all for now. The next (and last) chapter will cover some other typical problems which can be fixed by analyzing the threads. See you then!
