Skip to Content

Blog #4 – System call tracing

A. When to use these tools


While ASE does most of its query processing work without using many system calls there are times when the interaction with the operating system which is done through those system calls may be an issue. The tools I’ll discuss in this blog allow us to take a look at those calls and trace what is going on. Some examples of when this tracing might be useful are:

1) When you see repeated errors in the ASE log that are related to system functions

2) When you suspect that there may be a performance issue caused by slow handling of system calls by the OS.

3) When ASE is having problems booting up and the cause is not clear from the errorlog.


B.  What to look for

        There are two primary output formats that we will be looking at. One is the full output, where the system call, the parameters to the system call, and the returned value from the OS are shown. The second is a summary of the system calls that have been made, time spent in them, and number of errors.  What you want to look at will depend upon the problem you are trying to solve; but here are the general formats for the two output types.

Full output example:

32664 open(“/lib64/libpam.so.0”, O_RDONLY) = 3

Here the first column (32264) is the dataserver process id. The system call is open(), and the parameters are the name of the file (/lib64/limpam.so.0) and the type of open being done (read only). The return value comes after the equal sign (3) and in the case of open is the file descriptor number assigned by the OS.

An example of a failing system call:

32667 stat(“/work/ase157/ASE-15_0/certificates/dputz157b.txt”, 0x1495b6c20) = -1 ENOENT (No such file or directory)

Here the pid , system call, and parameters are in the same place, but we see that the OS returned a value of -1 (which is commonly used for errors) and the error type was ENOENT – no such file or directory. If this server had intended to have a SSL certificate this would be a problem; in this case SSL has never been set up and so we would expect no certificate to be found.

Summary of system calls example:

# truss -c -p 1234 (where 1234 is a dataserver process id)

signals ————
SIGALRM          189
SIGUSR2            7
total:           196

syscall               seconds   calls  errors
write                    .197    1559
open                     .073     230
close                    .002      12
brk                     4.143   86656
fstat                    .254    6823
kill                     .040     623
ioctl                    .247    4159
fcntl                    .025     687
lwp_unpark               .000       7
getmsg                   .101    2230    209
putmsg                   .075     241
poll                     .116    2125      2
sigprocmask             2.043   64476
setcontext               .000       7

sigtimedwait             .043     189
kaio                     .006      22
kaio                     .017     533
——–  ——   —-
sys totals:             7.389  170579    211
usr time:               1.216
elapsed:               18.860

The first column is the name of the system call, the second is the number of seconds spent in that call (that is, the time taken to process the call by the OS), the third is the number of calls, and the last column is the count of errors returned to that call.

Note in this example the high percentage (4.143 out of 7.389 seconds) used for the brk() call. This was actually due to a bug in ASE 12.5.4 caused by a change made, and  corrected by CR 480690.


C.  Specific tools


The name of the program used to generate system call traces varies by platform. On Solaris and AIX it is called “truss”, on Linux it is “strace” and on HPUX it may be named either “tusc” or “truss”, depending on what version of HPUX you are running. The syntax varies slightly for each one, but in general the options are:

  • Attach to a running program (typically the “-p” option)
  • Generate the system call summary report (typically the “-c” option)
  • Trace any child processes
  • Print a timestamp on each line
  • Only select certain specified system calls to trace

You should check the documentation for your system to get complete details on the options available.

      D. Examples

1. ASE won’t boot,  and the following messages are seen in the errorlog at startup time:

00:0000:00000:00000:2014/06/30 14:12:26.83 kernel  os_find_good_keyfile: Could not open file /work/dputz/ase157/ASE-15_0/dputz157b.krg: File exists

00:0000:00000:00000:2014/06/30 14:12:26.83 kernel  kbcreate: couldn’t get shmid for kernel region.

00:0000:00000:00000:2014/06/30 14:12:26.83 kernel  kistartup: could not create shared memory

This doesn’t make a lot of sense on the surface, as it seems to be saying that it can’t open a file because it exists. So, let’s start ASE using strace (since this is on a Linux system) using the following command:

strace –f –o /tmp/trace.out RUN_servername

In this case we want the full output and we’ll use the “-o” option to send the output to a file. This is usually a very good idea, as the trace programs can generate a *lot* of output. We’ll also use the “-f” option to follow all forked processes. If that was not included all that we would get would be the output from the RUN_servername shell script, which would not be what we need for this issue.

After the bootup fails again, examining the /tmp/trace.out output shows the following two lines:

16567 open(“/work/ase157/ASE-15_0/dputz157b.krg”, O_RDWR) = -1 EACCES (Permission denied)

16567 open(“/work/ase157/ASE-15_0/dputz157b.krg”, O_RDWR|O_CREAT|O_EXCL, 0600) = -1 EEXIST (File exists)


A good hint here is that very often the “interesting” lines in a trace will be near the end of the output file. In this case, we knew from the errorlog that an open() call was failing, so we would look for the open() calls near the end of the file.

There are two open calls that returned an error here – the first one when ASE tried to open the file /work/ase157/ASE-15_0/dputz157b.krg  for read/write (O_RDWR) and the second open when it tried to create a new file by that name (O_CREAT).  (If you want more information on the parameters for any particular call the Unix “man” utility can be of great help. In this case we could run “man open” and find out what those parameters stand for.) So, we see in the first open() call a return of EACCES – permission denied. Why would ASE not have permissions to open that file, which it normally creates every time it boots? We can run an “ls –l” on it to see what the current owner and permissions are. The output looks like:

bash-4.1$ ls -l dputz157b.krg

-rw——-. 1 root root 57 Jun 30 14:11 dputz157b.krg

            

                This shows us what the problem is. Someone has started ASE while they were running as the root user (not a very good idea, usually). So, the file is now owned by root and the permissions do not allow any other users to read the file, or create a new file with the same name. That is why the two open() system calls failed, and why the boot of ASE was unsuccessful. The remedy in this case would be to have a root user remove that file, which will allow ASE to boot normally.

   

           2. ASE is performing slowly, and general tuning makes no difference

Here we ran “truss –c” to get a look at the overall system call distribution and timing.

truss -c -p 12347

^Csignals ————

SIGALRM         1538

SIGUSR2            6

total:          1544

syscall               seconds   calls errors

close                    .000       1

kill                     .175    8193

ulimit                   .000       3

lwp_park                 .008     155

lwp_unpark               .005     155

setcontext               .000       6

sigtimedwait             .095    1538

lwp_sigmask              .000       6

kaio                     .007      73

kaio                     .017     780 613

pollsys                 1.811   85665 6

getrlimit64              .000       3

pwrite64                 .039     155

send                     .000       1

                     ——–  —— —-

sys totals:             2.159   96734 619

usr time:               1.891

elapsed:              153.890

The syscall column shows the  system call names, if you are unsure about what any of them do you can again use the Unix “man” command to find out.

We see here that there looks to be a large number of “pollsys” calls. These are a result of ASE doing poll() calls to check for completed async I/O. But the number of actual I/O calls (kaio) is not high. This tells us that ASE is spending a lot of time polling for disk I/O to complete. We could at this point perhaps use iostat (see my second blog on this topic for details) or ASE queries on MDA tables such as monDeviceIO or monIOQueue to determine if perhaps there is a slow or overloaded disk device.

Another good thing to look at in the “-c” output is the errors column. We see that there were a high percentage of kaio calls that returned an error. This output does not detail what those errors might have been, but if we ran another truss using the full output options we would see lines like:

kaio(AIOWAIT, 0xFFFFFFFFFFFFFFFF) Err#22 EINVAL

Solaris doesn’t have a man page for kaio(), but it does have one for aiowait(); which tells us that EINVAL is returned when aiowait() is called and there are no outstanding async I/Os. So, this error would appear to be normal.

        3. Messages in the ASE errorlog related to system calls

In this example, the errorlog was being filled up with messages like:

        00:00000:00000:2013/10/19 12:28:38.76 kernel  ncheck: select, Bad file descriptor

strace was attached to the running server with the following command:

strace –f –o /tmp/trace.out –p 30375

After a few seconds, strace was stopped (using ctrl-c) and the resulting output file was analyzed. It showed many repeated messages such as:

     30375 select(20, [16 17 18 19], NULL, NULL, {0, 0}) = -1 EBADF (Bad file descriptor)

This matches the error from the ASE errorlog. However, the additional information from the strace output shows the file descriptor numbers (16 17 18 and 19) being passed to the select() call. It was noted that there were only two open connections to ASE at the time this trace was done.  This clearly showed that the problem was an ASE routine that was not properly clearing disconnected file descriptors. This information was passed to SAP Product Support, and resulted in a fix from engineering   (this was CR 699429).

      F. Conclusion

Using the Unix system call tracing utilities can aid in identifying issues related to ASE’s use of system calls. These issues may be able to be corrected on the system, or they may require a fix to ASE. Using these tools can help provide us with a better view of what is going on and let us find and fix issues; often without needing to contact SAP Product Support.

To report this post you need to login first.

Be the first to leave a comment

You must be Logged on to comment or reply to a post.

Leave a Reply