System debugging and analysis techniques for ASE on Unix/Linux – Part 7
Program profiling
A. When to use these tools
The tools I will be looking at in this post are a little more esoteric than some. However, when used properly they can give a very detailed look at what an application (such as ASE) is doing. Generally, these tools are best suited for performance issues, especially when the problem is high CPU usage. Program profiling will show exactly what routines are using the most CPU, and can help determine what might help reduce that CPU usage. Note that these tools are also very useful for developers; but that usage is outside the scope of what I’ll be discussing.
B. What to look for
In its simplest form a program profile is a report showing what functions in a program are being called the most often and are using the most time. When CPU usage is high it is these functions that will be the most interesting to look at. While you may have to guess at what some of the functions do, based on their names, many names are clear enough that you can get a good idea of what is going on. You can also, of course, contact SAP support for more help, if needed.
As an example, here is some output from the oreport utility (part of oprofile) on a Linux box:
samples % image name symbol name
48818 4.8898 dataserver bufsearch_cache_getlatch
33586 3.3641 dataserver dol_qualrow
32516 3.2569 dataserver getpage_with_validation
28859 2.8906 dataserver s_execute
27724 2.7769 dataserver ksMuxSchedulerFindTask
19796 1.9828 dataserver LeRun
15713 1.5739 dataserver bufunkeep_cache
Here we can see that ASE was spending a lot of time in routines associated with data cache (bufsearch. getpage). So, we might want to look at the cache configuration on this server to make sure that the caches are sized properly.
It can also be informative to know what doesn’t need looking at. For instance, in the example above the spinlock routine (atomic_try on Linux) only showed less than 1% of overall cpu:
9308 0.9323 dataserver atomic_try
So, I would probably not spend much time trying to optimize spinlock usage on this server.
C. What tools to use
I’ll only be discussing those tools that can be used without recompiling. There are tools (such as prof, gprof, etc.) that depend upon compiler hints in order to do their profiling. Since ASE is not able to be recompiled these tools will not be available. Some of the tools that can be used are:
- Linux – oprofile & perf
- Solaris – dtrace & pprof
- AIX – trace & tprof
Note that these tools may need to be run as the root user. That is because they may directly access the CPU registers to get the current instruction pointer. If there are separate reporting programs (such as oreport) they can usually be run as a normal user.
These tools can require some setup to use. I have created some sample scripts and setup instructions.
They can be found at System program profiling scripts. Please read the comment in the scripts as well, as they will detail whether or not the tools need to be run as root.
It should also be noted that most of these tools will take a process id and only look at that single process. If you are running an ASE earlier than version 15.7, or if you are running 15.7 or later in process mode; then there will be one Unix process id for each engine. This means you may need to profile more than one time to look at multiple engines. When running in threaded mode (15.7 and later) then you will only need to profile the single process id, and the profiling will pick up all the engines as well as the I/O controller threads.
D. Examples
- An ASE server is experiencing very high CPU usage at intervals throughout the day. It does not seem to be related to actual work getting done; but it does occur when large numbers of users are logging in. This was on Linux, so oprofile was run during one of the cpu spikes. Using the command “opreport -l –merge tgid /Sybase/ASE-15_0/bin/dataserver” the following top lines were seen:
CPU: Intel Core/i7, speed 2799.9 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
samples % image name symbol name
11534 25.3454 dataserver atomic_try
1956 4.2981 dataserver lock_do_logical
1361 2.9906 dataserver ksMuxSchedulerFindTask
1350 2.9664 dataserver bufsearch_cache_getlatch
1299 2.8544 dataserver lock__ins_logical
1268 2.7863 dataserver trim_blanks
1104 2.4259 dataserver bt__qualpage
This showed that the ASE spinlock routine (which is named atomic_try on Linux) was using 25% of the cpu cycles on the system. However, this trace doesn’t show which spinlock (or spinlocks) caused the contention. Opreport was re-run on the same sample with the following command: “opreport – callgraph –merge tgid /Sybase/ASE-15_0/bin/dataserver”. It was now possible to look for the largest value of atomic_try in that output, and what was seen was:
samples % image name symbol name
————————————————————–
…
9 0.7103 dataserver make_fake
1258 99.2897 dataserver ins_sysproc
10236 22.4927 dataserver atomic_try
10236 100.000 dataserver atomic_try [self]
…
This showed that a large percentage of the spinlock contention was coming while querying the sysprocesses table. It turned out that the application was making many queries against that table in a login trigger, most of which were not needed. Reducing those unnecessary queries dropped CPU usage back to acceptable levels.
2. On a Solaris system, it was noticed that one engine was pegged at 100% while other engines were running at normal levels. dtrace was run against the engine using the dapprof script (which can be found as part of the scripts at System program profiling scripts ). The output showed the following “most used” functions:
CALL COUNT
…
hk_acct_stats_enabled 138
hk_relinquish 138
ind__copy_row 138
lock__remove_cachelock 138
no__data_sargs 138
ubo_init_slotscan_ctx 138
xact_check_detach_timeout 138
xact_getifstarted 138
…
sparclock 12359
ubcheck 137033
ubo_slotscan_getnext 137259
This trace enabled the issue to be identified as known CR 761787 and appropriate action taken.
E. Conclusion
While program profiling may not be a good choice for the first tool used to analyze performance issues, it can be a valuable addition when other tools are not able to pinpoint the problem. By determining which functions are being called the most often you can see where ASE is spending its time, and so decide what steps might be useful in helping reduce CPU usage and improve overall performance.
This is the last of the blogs that I have prepared on system debugging and analysis techniques for ASE on Unix/Linux. I hope they will be useful to you; I would be glad to answer any questions or comments you may have.