System performance monitoring

  1. When to use these tools.

It is often useful to take an overall look at the system when there are ASE performance issues. Any sort of resource shortage or contention on the system can directly affect the ability of ASE to perform properly, and identifying these resource bottlenecks can allow ASE to work at optimum levels. Tuning ASE to reduce resource consumption can often help in these cases, but it can be quite difficult to identify where the contention exists using the diagnostic tools inside ASE. For instance, a lack of CPU resources on the system may show up in ASE as high CPU busy. But if the reaction to high CPU busy is to increase the number of ASE engines, the situation will be made worse, as that simply increases the overall CPU contention on the system.

There are four main areas of system resources that we will look at:

  1. CPU
  2. Memory
  3. Disk
  4. Network

Each of these areas has specific tools that can be used to measure and evaluate performance. Note that there are also many third-party system performance tools available; we will not be discussing those, but most of the time they report metrics that you can relate to the ones provided by the system tools. For example, memory paging is the same whether you see it in vmstat output or in a nice graph generated by some third-party tool.

I will not be discussing each field in these outputs (there are a lot), but rather will go through the most important metrics for diagnosing each system resource. If you are interested in a complete discussion of the fields, I would suggest running “man <command>” to get the full documentation.

I have put together some scripts for various Unix and Linux platforms that run the commands whose output is discussed below. They can be found on SCN at System debugging and analysis techniques scripts.

     1. CPU

There are several key numbers to look at when analyzing CPU usage. The high-level view can start with user cpu %, system cpu %, and idle %. These can be found, for instance, in the output from the vmstat.sh script. That output might look something like:

kthr memory            page            disk          faults      cpu

r b w   swap free  re  mf pi po fr de sr s0 sd sd sd   in sy   cs us sy id

1 0 0 78426408 47056720 214 264 1134 73 73 0 0 0 5 5 5 6040 141463 3626 8 3 89

3 0 0 64302616 28740520 1 2 0  0  0 0  0  0 0  0  1 3997 865005 4517 17 8 75

3 0 0 64302576 28740560 0 0 0  1  1 0  0  0 0  0  0 3948 855812 4526 17 7 77

3 0 0 64302648 28740640 5 41 0 0  0  0 0  0  0 0  1 4135 859738 4619 17 7 76

2 0 0 64302768 28740768 0 0 0  0  0 0  0  0 0  0  1 4093 857408 4600 17 7 76

2 0 0 64302280 28740248 8 51 0 1  1  0 0  0  0 0  0 4132 852127 4581 17 7 76

3 0 0 64301992 28739984 15 121 0 2 2 0  0  0 0  0  0 4184 856176 4715 17 7 76

3 0 0 64301504 28739528 0 0 0  0  0 0  0  0 0  0  1 3942 864103 4517 17 7 76

2 0 0 64301496 28739528 2 13 0 1  1  0 0  0  0 0  0 3973 857463 4566 17 7 76

2 0 0 64299944 28737896 441 1775 0 0 0 0 0 0 0  0  0 4207 853593 5116 18 8 74

A couple of points to note here: 1) On many platforms the first line shows the averages since the system was last booted, which means you can probably ignore it when looking at current issues. 2) The last three columns are user busy % (us), system busy % (sy), and idle % (id). In this particular case, we can see that the system is currently roughly 75% idle, indicating a large reserve of CPU cycles. But what this doesn’t show us is whether the CPU usage is being spread evenly across all CPUs, or whether a few are very busy. For that, look at the mpstat.sh output. One sample interval might look like:
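The averaging described above can be done with a short awk one-liner. This is a minimal sketch, not part of the vmstat.sh scripts: it skips the two header lines plus the since-boot first sample, then averages the last three columns. The sample rows are taken from the vmstat output shown above (truncated for brevity).

```shell
# Average the us/sy/id columns of a vmstat run. NR > 3 skips the two header
# lines and the first sample (since-boot averages on many platforms).
avg=$(awk 'NR > 3 { us += $(NF-2); sy += $(NF-1); id += $NF; n++ }
           END { printf "avg us=%.0f sy=%.0f id=%.0f", us/n, sy/n, id/n }' <<'EOF'
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr s0 sd sd sd in sy cs us sy id
1 0 0 78426408 47056720 214 264 1134 73 73 0 0 0 5 5 5 6040 141463 3626 8 3 89
3 0 0 64302616 28740520 1 2 0 0 0 0 0 0 0 0 1 3997 865005 4517 17 8 75
3 0 0 64302576 28740560 0 0 0 1 1 0 0 0 0 0 0 3948 855812 4526 17 7 77
3 0 0 64302648 28740640 5 41 0 0 0 0 0 0 0 0 1 4135 859738 4619 17 7 76
EOF
)
echo "$avg"
```

Using the last three fields (NF-2 through NF) rather than fixed column numbers makes the same snippet work across platforms whose vmstat layouts differ in the middle columns.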

CPU minf mjf xcal  intr ithr csw icsw migr smtx  srw syscl  usr sys wt idl

  0    0 0   15   206  106  185 55   44    9 0 180725   74  17 0  9

  1   10 0    8   258 127  326    7 63    8    0 8412    5   2 0  93

  3    0 0    6   346  169  458 7   61    8 0  4341    2 2   0  96

  4   22 0   10   222  107  267 5   43   10 0 11051    4   2 0  94

  5    0 0    7   184   76  226 18   81   13 0 141239   36  15 0  50

  6    0 0    3   456  273  398 6   41    7 0  6665    2 2   0  96

  7    9 0  235   421  139  437 12  101   16 0 49070   10   6 0  83

16    0 0    6   153   65  169 56   39    6 0 168353   81  16   0  3

17    0 0    3   262  126  363 8   55    9 0 13042    4   3 0  93

19    0 0    3   321  156  420 7   57    9 0  4608    1 1   0  98

20    0 0    3   227  104  273 5   38    9 0 10610    5   5 0  90

21    0 0   10   159   65  192 12   68    9 0 204277   37  23 0  40

22    0 0    2   440  147  369 6   39    7 0  8532    2 1   0  97

23    0 0    9   391  174  531 16  108   16 0 46734    9   6 0  85

Here we see that the system has 14 CPUs: two are less than 10% idle, two are around 40-50% idle, and the rest are nearly completely idle. The overall average is about 75% idle, but we may have a couple of very busy jobs, each using up nearly a complete CPU. This tells us that we may have to do some more detailed analysis to determine what those jobs are (which will be a topic in a future blog).
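Spotting the overloaded CPUs by eye gets tedious on large boxes; a small awk filter over the mpstat output can list any CPU whose idle% (idl, the last column) falls below a threshold. This is a sketch, using a few sample rows from the output above:

```shell
# List CPUs from an mpstat sample whose idle% (last column) is below a
# threshold, to spot uneven load across CPUs.
busy=$(awk -v limit=10 'NR > 1 && $NF < limit { b = b (b ? " " : "") $1 }
                        END { print (b ? b : "none") }' <<'EOF'
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 0 0 15 206 106 185 55 44 9 0 180725 74 17 0 9
1 10 0 8 258 127 326 7 63 8 0 8412 5 2 0 93
5 0 0 7 184 76 226 18 81 13 0 141239 36 15 0 50
16 0 0 6 153 65 169 56 39 6 0 168353 81 16 0 3
EOF
)
echo "CPUs under 10% idle: $busy"
```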

The ratio of user cpu % (usr) to system cpu % (sys) is also important. In general, ASE does not make a lot of system calls to process a typical query. Therefore, we would expect system cpu % (the percentage of time spent handling system calls) to be fairly low. If it is high, it may mean either that ASE is making an unexpectedly large number of system calls or that the OS is not handling them very efficiently.

Correlating the overall cpu busy with ASE “busyness” can be a challenge. The measures that ASE uses differ from those of the OS, in that ASE counts time spent in the scheduler looking for work as idle. But it may still be using cpu cycles to search for runnable processes, or making system calls to poll for completed disk I/Os or incoming network packets. As a result, the OS would view the engine as busy while ASE metrics would show it as either Idle or I/O Busy. In addition, when using threaded mode in versions 15.7 and newer, ASE will show up as a single process (with multiple threads) to the OS. It can become even more difficult to separate things out when other jobs running on the server are also using CPU. I’ll discuss later on how to help separate things out.
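Since a threaded-mode ASE appears to the OS as one process, per-thread views are needed to see what its individual engine threads are doing. On Linux the thread list for a process is visible under /proc/&lt;pid&gt;/task, and tools like "ps -L" or "top -H" can show per-thread CPU. A minimal sketch, demonstrated against the current shell since an actual ASE pid is site-specific:

```shell
# Count the threads of a process via /proc (Linux-specific). With a real
# ASE dataserver pid, each engine thread would appear as a separate entry.
pid=$$    # stand-in pid; substitute the ASE dataserver pid in practice
nthreads=$(ls /proc/$pid/task | wc -l)
echo "process $pid has $nthreads thread(s)"
```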

     2. Memory

The memory in a Unix system can be divided into two main categories – memory used by the OS and memory used by processes. The biggest concern when looking at memory usage is determining whether there is enough memory contention to cause performance degradation. We can again start with vmstat output to get a general overall view of the system. The columns will vary by system type, but here is a sample Linux output:

procs -----------memory---------- ---swap-- -----io---- --system-- ------cpu------

r b   swpd   free buff    cache   si so    bi    bo in   cs us sy id wa st

6 1    660 14931620 100688 21177656    0    0 801   234    1 1 13  1 84  2  0

6 1    660 14930736 100696 21177648    0    0 18940 4613 4204 15503 35  2 60  3  0

6 1    660 14931612 100700 21177676    0    0 21538 3534 3993 14152 34  2 60  5  0

7 0    660 14931984 100712 21177668    0    0 14154 3556 4033 13898 32  1 62  5  0

7 1    660 14931964 100712 21177680    0 0 14618  4522 4346 15921 33  2 61 4  0

5 1    660 14932324 100716 21177684    0    0 11058 4319 4025 14549 30  1 63  6  0

5 1    660 14932628 100724 21177672    0    0 12654 5710 4392 16194 30  2 62  5  0

5  1    660 14933004 100728 21177680    0 0 11048  2845 4018 13458 32  1 64 4  0

4 1    660 14933252 100728 21177680    0    0 10880 3587 4222 14738 28  1 67  4  0

5 1    660 14933872 100736 21177680    0    0 11352 5302 4724 17347 31  2 64  4  0

The most important columns to look at here are “si” and “so”. These indicate paging in from (si) and out to (so) swap space on disk. Ideally, these values should always be 0. Any non-zero value indicates some amount of memory contention, and the higher the values, the worse the contention. Many modern OSes attempt to use as much memory as is available for functions such as the file system buffer cache, so do not be too concerned if the “free” column has a relatively low value. As long as there is no paging going on, a low free value will not hurt performance; in fact, it can improve performance by letting the OS make good use of more of the system memory.
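The si/so check is easy to automate when scanning a long vmstat capture. A sketch, using the Linux column positions shown above (si is field 7, so is field 8) and a couple of sample rows from that output:

```shell
# Count vmstat samples with non-zero swap-in (si) or swap-out (so); any
# hit means real paging to swap, i.e. memory contention.
paging=$(awk 'NR > 2 && ($7 > 0 || $8 > 0) { n++ } END { print n+0 }' <<'EOF'
procs memory swap io system cpu
r b swpd free buff cache si so bi bo in cs us sy id wa st
6 1 660 14931620 100688 21177656 0 0 801 234 1 1 13 1 84 2 0
6 1 660 14930736 100696 21177648 0 0 18940 4613 4204 15503 35 2 60 3 0
EOF
)
echo "samples showing swap activity: $paging"
```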

The vmstat.sh scripts also use the “-s” option to show collected memory statistics. In general, these values are accumulated totals since the last system boot. As such, they will not show immediate issues, but they can be very useful in determining whether there has been any memory contention since the system was last booted. A partial example from Solaris on a system with no memory contention shows:

        0 swap ins

        0 swap outs

        0 pages swapped in

        0 pages swapped out

473997441 total address trans. faults taken

35643878 page ins

  2140018 page outs

254314432 pages paged in

16457040 pages paged out

Note that a distinction is made between memory pages “swapped out” and pages “paged out”. The “paged out” count includes pages from the file system buffer cache that were written to disk, which is a normal function of the OS, so it does not indicate any contention. An example from a Linux system that has seen memory contention shows:

     24734348 total memory

13923988  used memory

1477148  active memory

11455092  inactive memory

10810360  free memory

346920  buffer memory

12267784  swap cache

25165816  total swap

26636  used swap

25139180  free swap

2302227251 pages paged in

8418742410 pages paged out

65435 pages swapped in

1923934 pages swapped out

Here we see that a significant number of pages have been swapped out, but the used swap space is not very large, indicating that the contention occurred sometime in the past. If you see pages getting swapped in and out on a regular basis, you can be sure that overall system performance is being impacted by memory contention. We will look later on at some tools to help us determine what processes are using the memory.
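Since the "vmstat -s" counters are cumulative since boot, a quick filter on the swap lines gives a yes/no answer on whether the box has ever hit contention. A sketch using the Linux figures shown above:

```shell
# Check the cumulative "pages swapped out" counter from vmstat -s style
# output; a non-zero value means memory contention at some point since boot.
verdict=$(printf '%s\n' '65435 pages swapped in' '1923934 pages swapped out' |
  awk '/pages swapped out/ { print ($1 > 0 ? "contention since boot" : "no contention since boot") }')
echo "$verdict"
```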

     3. Disk

There are two primary measurements to look at for disk I/O. One is the overall amount of traffic to a particular device, and the second is the average response time taken by the disk subsystem to complete a request. On most platforms the iostat tool can be used to show these values (on HP-UX we would use sar with the “-d” option instead).

Here is an iostat example output when there was very heavy write activity on a couple of disks:

Device:          rrqm/s  wrqm/s  r/s    w/s    rsec/s  wsec/s   avgrq-sz avgqu-sz await svctm  %util

sda               0.00     1.00 0.00    1.20     0.00 16.00        13.33     0.01 4.83   3.83   0.46

sdb               0.00     0.00 0.00  932.20     0.00 224614.40   240.95     0.96 1.03   1.03  95.66

sdc               0.00     0.00 0.00  933.20     0.00 224870.40   240.97     0.47 0.51   0.51  47.14

dm-0              0.00     0.00 0.00    2.00     0.00 16.00         8.00     0.01 2.90   2.30   0.46

Here we can see that sdb and sdc are seeing heavy writes, but the average service times (svctm) are quite low, 1 millisecond or less. The only way to improve disk speeds with this profile would be to move load to other disks.

*Note* According to the iostat man page on recent Linux versions, the service time (svctm) should not be used; instead, use the average wait time (await) as the measurement of performance. This is due to how the Linux kernel collects disk statistics.
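Following that advice, a scan for slow devices can key on the await column instead of svctm. A sketch: in the iostat -x layout shown earlier, await is the 10th field, and the sample rows come from that output (the 20 ms threshold is the rule-of-thumb bottleneck level discussed below).

```shell
# Flag devices whose await (field 10 in this iostat -x layout) exceeds a
# threshold in milliseconds.
slow=$(awk -v limit=20 'NR > 1 && $10 > limit { s = s (s ? " " : "") $1 }
                        END { print (s ? s : "none") }' <<'EOF'
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 1.00 0.00 1.20 0.00 16.00 13.33 0.01 4.83 3.83 0.46
sdb 0.00 0.00 0.00 932.20 0.00 224614.40 240.95 0.96 1.03 1.03 95.66
sdc 0.00 0.00 0.00 933.20 0.00 224870.40 240.97 0.47 0.51 0.51 47.14
EOF
)
echo "devices with await > 20ms: $slow"
```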

From a different platform, we can see what high service times look like:

r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
2.8   162.7   7.0  649.5  0.0 25.4    0.0  153.3   0 100 md/d1
53.1  297.5 306.2  912.0  0.0 46.3    0.0  102.8   0 100 md/d11

82.4  142.6 181.7  442.4  0.0 25.2    0.0  111.9   0 100 md/d21

The service times here (asvc_t) are quite high – over 100 milliseconds. While it is difficult to generalize, since different types of disk devices have different speeds, a general rule of thumb for disk average service times is:
< 10 milliseconds = OK (though very fast disks such as SSD should only be 1-2 milliseconds)
10-20 milliseconds = maybe OK, but could be a problem
>20 milliseconds = there is a disk bottleneck
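The rule of thumb above can be encoded as a small helper for use in monitoring scripts. This is just a sketch of the three-band classification, with the function name being my own:

```shell
# Classify an average disk service time (in milliseconds) per the rule of
# thumb: <10 ms OK, 10-20 ms borderline, >20 ms a bottleneck.
classify() {
  awk -v ms="$1" 'BEGIN {
    if (ms < 10)       print "OK"
    else if (ms <= 20) print "maybe OK, could be a problem"
    else               print "disk bottleneck"
  }'
}
classify 4.8     # a healthy device
classify 153.3   # like md/d1 in the sample above
```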

Slower or overloaded disks can obviously have a large impact on any database server's performance. If an ASE seems to be running a bit sluggish, it is always worthwhile to make sure there is not some disk issue contributing to the slower performance.

     4. Network

Since many network issues, such as improper routing and problems due to traffic load, take place in the network hardware, it can be difficult to diagnose them from the system itself. The best tool for looking at what the system sees is netstat. The netstat.sh script will show some of the more useful metrics.

The first output from netstat.sh shows the traffic and error levels for each network interface:

Iface       MTU Met    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg

eth0       1500 0 1349503855      0      0 0 1574347028      0      0 0

eth1       1500 0        0      0 0      0        6 0      0      0

eth2       1500 0  9107774      0 0      0      294 0      0      0

eth3       1500 0 743127146      0 0      0 7354573006      0 0      0

Note that on most platforms the netstat output columns do not line up very well, mostly because of the very high values some columns may contain. Here we see 4 interfaces, one of which is almost never used, but we don’t see any errors (RX-ERR and TX-ERR) or drops (RX-DRP and TX-DRP) being reported. Ideally, that is what should show up; errors in this output generally indicate a problem with the interface itself.
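Even with the ragged alignment, the error and drop columns sit at fixed field positions, so summing them is straightforward. A sketch over two sample interfaces from the output above (the Flg values are assumed, since that column was cut off in the capture):

```shell
# Sum the error and drop counters from netstat -i style output; anything
# non-zero generally points at the interface itself.
# Fields: RX-ERR=$5, RX-DRP=$6, TX-ERR=$9, TX-DRP=$10 in this layout.
errs=$(awk 'NR > 1 { e += $5 + $6 + $9 + $10 } END { print e+0 }' <<'EOF'
Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0 1500 0 1349503855 0 0 0 1574347028 0 0 0 BMRU
eth3 1500 0 743127146 0 0 0 7354573006 0 0 0 BMRU
EOF
)
echo "interface errors and drops: $errs"
```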

We can also get metrics on a per-protocol basis from netstat. Here is an example of TCP values on Solaris:

TCP     tcpRtoAlgorithm     = 4     tcpRtoMin           = 400

        tcpRtoMax           = 60000     tcpMaxConn          = -1

        tcpActiveOpens      =2681247    tcpPassiveOpens     =358412

        tcpAttemptFails     =2305528 tcpEstabResets      =183254

        tcpCurrEstab        = 198     tcpOutSegs          =250743921

        tcpOutDataSegs      =1896837601 tcpOutDataBytes     =3429284834

        tcpRetransSegs      =136413     tcpRetransBytes     =182285123

        tcpOutAck           =42999509   tcpOutAckDelayed    =864043

        tcpOutUrg           = 334     tcpOutWinUpdate     = 509

        tcpOutWinProbe      = 4890     tcpOutControl       =5976598

tcpOutRsts          =2385762    tcpOutFastRetrans   =    16

        tcpInSegs           =181700271

        tcpInAckSegs        =892374093  tcpInAckBytes       =2170564957

        tcpInDupAck         =1820706    tcpInAckUnsent      = 0

        tcpInInorderSegs    =181655424 tcpInInorderBytes   =1103918392

        tcpInUnorderSegs    =288318 tcpInUnorderBytes   =388521280

        tcpInDupSegs        = 12909     tcpInDupBytes       =609222

        tcpInPartDupSegs    = 70     tcpInPartDupBytes   = 23647

        tcpInPastWinSegs    = 0     tcpInPastWinBytes   = 0

        tcpInWinProbe       = 70     tcpInWinUpdate      = 4871

        tcpInClosed         = 1034     tcpRttNoUpdate      =609634

        tcpRttUpdate        =891264817 tcpTimRetrans       =1393901

        tcpTimRetransDrop   = 65     tcpTimKeepalive     = 20402

        tcpTimKeepaliveProbe=  4744 tcpTimKeepaliveDrop =    53

        tcpListenDrop       = 0     tcpListenDropQ0     = 0

        tcpHalfOpenDrop     = 0     tcpOutSackRetrans   =123211

Yes, there are a *lot* of numbers there. The two main values that tell us whether we are seeing network issues are retransmissions (tcpTimRetrans, tcpRetransSegs) and drops (tcpTimRetransDrop). Any time we see retransmissions or drops, we know that there have been some network issues that slowed down responses to users; but it is mostly a matter of percentages (i.e. the ratio of tcpRetransSegs to tcpOutDataSegs). In this case that ratio is a little more than .007 percent of all packets, which is low enough not to be a big issue. Generally, if this ratio starts to hit 1% or higher, it indicates that the network may have a problem.
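The percentage quoted above comes straight from the two counters in the Solaris output. A quick check of the arithmetic:

```shell
# Retransmission percentage from the netstat counters above:
# tcpRetransSegs / tcpOutDataSegs * 100.
ratio=$(awk 'BEGIN { printf "%.4f", 136413 / 1896837601 * 100 }')
echo "retransmission ratio: ${ratio}%"
```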


I hope these examples can help you determine whether your system is running with resource contention and, if so, where you might start looking to improve performance.

