Program profiling


    A. When to use these tools

The tools I will be looking at in this post are a little more esoteric than some. However, when used properly they can give a very detailed look at what an application (such as ASE) is doing. Generally, these tools are best suited for performance issues, especially when the problem is high CPU usage. Program profiling will show exactly what routines are using the most CPU, and can help determine what might help reduce that CPU usage. Note that these tools are also very useful for developers; but that usage is outside the scope of what I’ll be discussing.

    B. What to look for

In its simplest form a program profile is a report showing what functions in a program are being called the most often and are using the most time. When CPU usage is high it is these functions that will be the most interesting to look at. While you may have to guess at what some of the functions do, based on their names, many names are clear enough that you can get a good idea of what is going on. You can also, of course, contact SAP support for more help, if needed.

As an example, here is some output from the oreport utility (part of oprofile) on a Linux box:

samples %        image name               symbol name

48818 4.8898  dataserver               bufsearch_cache_getlatch

33586 3.3641  dataserver               dol_qualrow

32516 3.2569  dataserver               getpage_with_validation

28859 2.8906  dataserver               s_execute

27724 2.7769  dataserver               ksMuxSchedulerFindTask

  19796 1.9828  dataserver               LeRun

  15713 1.5739  dataserver               bufunkeep_cache

Here we can see that ASE was spending a lot of time in routines associated with data cache (bufsearch. getpage). So, we might want to look at the cache configuration on this server to make sure that the caches are sized properly.

It can also be informative to know what doesn’t need looking at. For instance, in the example above the spinlock routine (atomic_try on Linux) only showed less than 1% of overall cpu:

           9308 0.9323  dataserver               atomic_try

So, I would probably not spend much time trying to optimize spinlock usage on this server.

    C. What tools to use

I’ll only be discussing those tools that can be used without recompiling. There are tools (such as prof, gprof, etc.) that depend upon compiler hints in order to do their profiling. Since ASE is not  able to be recompiled  these tools will not be available. Some of the tools that can be used are:

  • Linux – oprofile & perf
  • Solaris – dtrace & pprof
  • AIX – trace & tprof

Note that these tools may need to be run as the root user. That is because they may directly access the CPU registers to get the current instruction pointer. If there are separate reporting programs (such as oreport) they can usually be run as a normal user.

These tools can require some setup to use. I have created some sample scripts and setup instructions.

They can be found at System program profiling scripts. Please read the comment in the scripts as well, as they will detail whether or not the tools need to be run as root.

It should also be noted that most of these tools will take a process id and only look at that single process. If you are running an ASE earlier than version 15.7, or if you are running 15.7 or later in process mode; then there will be one Unix process id for each engine. This means you may need to profile more than one time to look at multiple engines. When running in threaded mode (15.7 and later) then you will only need to profile the single process id, and the profiling will pick up all the engines as well as the I/O controller threads.

   D. Examples

  1. An ASE server is experiencing very high CPU usage at intervals throughout the day. It does not seem to be related to actual work getting done; but it does occur when large numbers of users are logging in. This was on Linux, so oprofile was run during one of the cpu spikes. Using the command “opreport -l –merge tgid /Sybase/ASE-15_0/bin/dataserver” the following top lines were seen:

CPU: Intel Core/i7, speed 2799.9 MHz (estimated)

Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000

samples  % image name               symbol name

11534    25.3454 dataserver atomic_try

1956      4.2981 dataserver lock_do_logical

1361      2.9906 dataserver               ksMuxSchedulerFindTask

1350      2.9664 dataserver bufsearch_cache_getlatch

1299      2.8544 dataserver lock__ins_logical

1268      2.7863 dataserver trim_blanks

1104      2.4259 dataserver               bt__qualpage

This showed that the ASE spinlock routine (which is named atomic_try on Linux) was using 25% of the cpu cycles on the system. However, this trace doesn’t show which spinlock (or spinlocks) caused the contention. Opreport was re-run on the same sample with the following command: “opreport – callgraph –merge tgid /Sybase/ASE-15_0/bin/dataserver”. It was now possible to look for the largest value of  atomic_try in that output, and what was seen was:

samples  % image name               symbol name


  9       0.7103 dataserver               make_fake

   1258 99.2897  dataserver               ins_sysproc

10236   22.4927 dataserver               atomic_try

  10236 100.000  dataserver               atomic_try [self]

This showed that a large percentage of the spinlock contention was coming while querying the sysprocesses table. It turned out that the application was making many queries against that table in a login trigger, most of which were not needed. Reducing those unnecessary queries dropped CPU usage back to acceptable levels.

      2.  On a Solaris system, it was noticed that one engine was pegged at 100% while other engines were running at normal levels.  dtrace was run against the engine using the dapprof script (which can be found as part of the scripts at System program profiling scripts ). The output showed the following “most used” functions:

CALL                                                         COUNT

hk_acct_stats_enabled                                          138

hk_relinquish                                                  138

ind__copy_row                                                  138

lock__remove_cachelock                                         138

no__data_sargs                                                 138

ubo_init_slotscan_ctx                                          138

xact_check_detach_timeout                                      138

xact_getifstarted                                              138

sparclock                                                    12359

ubcheck                                                     137033

ubo_slotscan_getnext                                        137259

This trace enabled the issue to be identified as known CR 761787 and appropriate action taken.

    E. Conclusion

While program profiling may not be a good choice for the first tool used to analyze performance issues, it can be a valuable addition when other tools are not able to pinpoint the problem. By determining which functions are being called the most often you can see where ASE is spending its time, and so decide what steps might be useful in helping reduce CPU usage and improve overall performance.

This is the last of the blogs that I have prepared on system debugging and analysis techniques for ASE on Unix/Linux. I hope they will be useful to you; I would be glad to answer any questions or comments you may have.

To report this post you need to login first.

Be the first to leave a comment

You must be Logged on to comment or reply to a post.

Leave a Reply