Performance investigation, very high runq-sz %runocc
Posted by Solarius on Wednesday 23rd of March 2011, 10:47 PM



I've just been handed a hot potato by a colleague who left... our client has been complaining about slow performance on one of our servers.
I'm not very experienced in investigating performance issues, so I'm hoping someone will be kind enough to provide some guidance.

Here is an overview of the system:

- running Solaris 10 SPARC, multiple Sybase instances & apps (Java, Perl, financial software)

- kernel version: Generic_142900-13
Code:
$ uptime
1:23pm  up 13 day(s), 17:34,  19 users,  load average: 21.75, 22.65, 25.14

Huge amount of memory & CPUs:
# prtdiag -v
System Configuration:  Sun Microsystems  sun4u Sun Fire E25K
System clock frequency: 150 MHz
Memory size: 163840 Megabytes

========================= CPUs =========================

          CPU      Run    E$    CPU     CPU
Slot ID   ID       MHz    MB   Impl.    Mask
--------  -------  ----  ----  -------  ----
/SB00/P0    0,  4  1800  32.0  US-IV+   2.2
/SB00/P1    1,  5  1800  32.0  US-IV+   2.2
/SB00/P2    2,  6  1800  32.0  US-IV+   2.2
/SB00/P3    3,  7  1800  32.0  US-IV+   2.2
/SB01/P0   32, 36  1350  16.0  US-IV    3.1
/SB01/P1   33, 37  1350  16.0  US-IV    3.1
/SB01/P2   34, 38  1350  16.0  US-IV    3.1
/SB01/P3   35, 39  1350  16.0  US-IV    3.1
/SB04/P0  128,132  1800  32.0  US-IV+   2.2
/SB04/P1  129,133  1800  32.0  US-IV+   2.2
/SB04/P2  130,134  1800  32.0  US-IV+   2.2
/SB04/P3  131,135  1800  32.0  US-IV+   2.2
/SB05/P0  160,164  1800  32.0  US-IV+   2.2
/SB05/P1  161,165  1800  32.0  US-IV+   2.2
/SB05/P2  162,166  1800  32.0  US-IV+   2.2
/SB05/P3  163,167  1800  32.0  US-IV+   2.2
/SB08/P0  256,260  1350  16.0  US-IV    3.1
/SB08/P1  257,261  1350  16.0  US-IV    3.1
/SB08/P2  258,262  1350  16.0  US-IV    3.1
/SB08/P3  259,263  1350  16.0  US-IV    3.1

But even with all that CPU power, the system still seems to be choking:
# sar -q

SunOS aubbwsyd01 5.10 Generic_142900-13 sun4u    03/24/2011

00:00:01 runq-sz %runocc swpq-sz %swpocc
00:05:02    26.4      72     0.0       0
00:10:02    25.9      71     0.0       0
00:15:02    27.4      73     0.0       0
00:20:01    27.3      62     0.0       0
00:25:01    25.5      66     0.0       0
00:30:02    26.9      75     0.0       0
00:35:01    36.1      60     0.0       0
00:40:02    28.5      64     0.0       0
00:45:01    30.6      58     0.0       0
00:50:02    30.0      64     0.0       0
00:55:02    30.4      59     0.0       0
01:00:02    26.7      64     0.0       0
...
12:45:02    29.5      78     0.0       0
12:50:01    27.4      90     0.0       0
12:55:01    29.7      79     0.0       0
13:00:03    30.7      76     0.0       0
13:05:01    30.4      86     0.0       0
13:10:03    34.6      81     0.0       0
13:15:01    26.8      84     0.0       0
13:20:02    30.4      77     0.0       0
13:25:01    31.6      72     0.0       0

Average     29.5      69     0.0       0
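As a sanity check, here's the back-of-the-envelope arithmetic I've been doing with these numbers. The CPU count of 40 is my reading of the prtdiag output above (20 dual-core processors, two CPU IDs per row), and the sar figures are the daily averages:

```python
# Back-of-the-envelope: how deep is the run queue per CPU?
# ncpu = 40 is derived from the prtdiag output above
# (20 processors, each dual-core, so two CPU IDs per row).
ncpu = 40
runq_sz = 29.5    # sar -q daily average run-queue length
runocc = 0.69     # fraction of samples with a non-empty queue

# Time-averaged number of runnable threads waiting for a CPU.
waiting = runq_sz * runocc
print(f"avg waiting threads: {waiting:.1f}")          # ~20.4
print(f"per CPU:             {waiting / ncpu:.2f}")   # ~0.51
```

Roughly half a waiting thread per CPU on average doesn't look catastrophic by itself, which is part of why I'm confused about the complaints.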

# sar -r

SunOS aubbwsyd01 5.10 Generic_142900-13 sun4u    03/24/2011

00:00:01 freemem freeswap
00:05:02  586184 110438515
00:10:02  562080 113580170
00:15:02  547328 111934356
00:20:01  577790 111795786
00:25:01  597018 112950564
00:30:02  630584 110620673
00:35:01  649792 113179258
00:40:02  662950 110557264
00:45:01  658017 113512159
00:50:02  633167 110902038
00:55:02  644952 113924963
01:00:02  610516 112041306
...
12:45:02  348721 97869521
12:50:01  340880 96804395
12:55:01  339169 98490899
13:00:03  327440 99308450
13:05:01  336337 97280372
13:10:03  341150 99300626
13:15:01  345920 98246498
13:20:02  369102 99563900
13:25:01  387421 99101277

Average   627886 118480917
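One thing that threw me at first: sar -r reports freemem in pages, not KB. Assuming the default 8 KB base page size on sun4u (pagesize(1) on the box would confirm), the midday samples work out to roughly the same ~3 GB free that top shows:

```python
# sar -r freemem is in pages; on sun4u the base page size is 8 KB
# (an assumption here -- pagesize(1) would confirm it).
page_kb = 8
freemem_pages = 340_880                      # the 12:50 sample above
free_gb = freemem_pages * page_kb / 1024**2
print(f"free memory ~ {free_gb:.1f} GB")     # ~2.6 GB
```

About 2.6 GB free out of 160 GB is tight in relative terms, but the swap queues are empty (swpq-sz 0.0), so I don't think we're paging.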

# mpstat 5 2
... (2nd iteration below)
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0 2152   1 26484   926  336 1593  276  649  976    5 14654   34  44   0  22
  1 2056   1 32114   796  285 1322  254  597  958    9 16580   38  43   0  20
  2 1715   1 25972   888  323 1578  262  602  822    3 22862   33  46   0  21
  3 1706   2 29307   724  279 1183  197  515  820    6 19937   40  39   0  21
  4 1378   0 25992   816  313 1464  211  564  779    1 16577   43  35   0  22
  5 1587   1 28487   808  302 1420  237  571  930    5 20051   31  48   0  21
  6 1429   1 19215   765  286 1338  207  521  830    3 21779   38  39   0  24
  7 1547   0 22940   801  293 1497  234  557  820    2 19536   35  44   0  22
 32 1217   2 15876  1314  641 1125  287  555  574    3  5699   31  57   0  12
 33 1304   3 23066   870  303 1469  307  664  603    3  7398   38  47   0  15
 34 1459   1 25564   951  337 1565  330  691  660    3  8834   32  51   0  16
 35 1282   2 22116   898  340 1565  280  633  585    3  7867   36  47   0  17
 36 1255   1 20946   802  286 1296  285  583  567    3  9369   30  61   0   9
 37 1348   0 23823   813  297 1426  260  581  601    3  7670   32  51   0  17
 38 1028   1 21024   810  296 1434  258  588  551    4  6874   32  51   0  17
 39 1065   1 21564   706  270 1321  192  512  771    1  7690   36  47   0  17
128 1517   1 25091  1059  375 1535  371  733  860    2 27353   41  44   0  16
129 1707   1 27668   927  334 1448  308  673  823    2 20142   39  44   0  17
130 1376   2 23294   866  318 1349  282  624  745    3 26822   37  46   0  17
131 1238   4 20804   895  322 1425  325  610  744    3 32165   46  39   0  15
132 1169   1 24721   780  283 1264  262  535  798    3 31841   47  39   0  14
133 1339   0 20148   789  289 1202  256  537  928    1 30757   46  41   0  13
134 1134   2 21571   862  315 1372  279  587  812    2 32827   46  38   0  16
135 1296   2 19052   898  331 1437  293  601  680    2 28036   43  39   0  18
160 1151   0 20643   730  241 1027  292  470 1065    3 57836   57  36   0   8
161 1094   0 13299   848  297 1188  323  473 1050    3 58257   45  46   0  10
162 1245   0 15682   923  330 1221  370  477  778    3 53849   49  42   0   9
163  927   0 9607   845  297 1145  370  423  678    2 69122   55  39   0   6
164  560   0 14091  4496 4033 1016  276  380  515    2 50642   50  42   0   9
165  675   0 18376  1595 1135 1002  259  377  662    2 62744   52  36   0  12
166  593   0 9206   901  331 1215  375  421  529    2 81789   59  33   0   8
167  838   0 24495   733  267  958  279  361  566    2 54789   53  35   0  12
256 1409   4 20748   878  309 1192  282  560  546    3 17693   36  49   0  16
257 1363   4 19532   848  298 1201  305  522  566    3 24880   39  48   0  13
258 1252   2 27165   865  322 1192  267  507  644    5 27032   32  52   0  15
259 1089   0 18189   902  379 1211  252  480  490    2 26119   36  47   0  17
260 1249   4 19819  1018  397 1508  303  570  468    3 28197   34  45   0  21
261 1081   6 18595   807  326  985  241  447  490    2 29507   34  51   0  14
262 1065   3 16197   882  351 1290  251  478  471    2 32525   33  48   0  19
263 1095   2 21474  1308  791 1218  237  477  562    3 26501   32  49   0  19
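The idl column in mpstat looks much busier than top's 56.7% idle snapshot suggests, so I averaged a handful of rows by hand (one CPU per board, values transcribed from the output above):

```python
# idl values transcribed from a sample of the mpstat rows above,
# roughly one CPU per board: CPUs 0, 32, 128, 160, 256.
idle_samples = [22, 12, 16, 8, 16]
avg_idle = sum(idle_samples) / len(idle_samples)
print(f"avg idle across sampled CPUs: {avg_idle:.0f}%")   # ~15%
```

So the box really does look CPU-bound, and top's idle figure may just be a single-interval snapshot. The consistently high sys time (~40-50%) alongside the smtx and xcal counts is the other thing that puzzles me.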

# top
last pid: 13141;  load avg:  24.9,  25.3,  25.3;       up 13+17:46:55                                                  13:36:00
1399 processes: 1382 sleeping, 1 running, 1 zombie, 15 on cpu
CPU states: 56.7% idle, 24.5% user, 18.8% kernel,  0.0% iowait,  0.0% swap
Memory: 160G phys mem, 3221M free mem, 281G swap, 276G free swap

   PID USERNAME LWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
  3035 sybdev   176   0    0   12G   12G cpu    115.2H   122% dataserver
 15934 sybase   264   0    0   12G   12G cpu    143.8H   108% dataserver
 15440 sybase   264   0    0   12G   12G cpu    170.9H 98.04% dataserver
  5436 sybdev   158   0    0   12G   12G cpu    195.5H 97.95% dataserver
 15932 sybase   264   0    0   12G   12G cpu     50.0H 97.94% dataserver
  2860 sybdev   264   0    0   12G   12G cpu    186.6H 88.24% dataserver
 15955 sybase   264   0    0   12G   12G cpu     26.6H 79.29% dataserver
 15966 sybase   264   4    0   12G   12G sleep   34.4H 59.64% dataserver
  2902 sybdev   264   0    0   12G   12G cpu    101.1H 59.48% dataserver
 15937 sybase   264   0    0   12G   12G cpu    140.2H 41.35% dataserver
 19421 appdev   1   0    0  443M  411M sleep  836:14 33.02% perl
 12074 appdev 999  59    0 3002M 2817M sleep   33.3H 31.77% java
 24636 appdev 999  59    0  485M  432M sleep   18:12 31.40% java
 27539 appdev   1   0    0 1843M 1655M cpu     46.6H 29.13% perl
 10297 appdev   1   0    2   39M   19M cpu    104:16 28.15% perl


So I just can't figure out where these huge run queues are coming from... can someone please tell me what I'm missing, or what the next thing to check would be?
Maybe it's staring me right in the face and I just don't see it.

Many thanks in advance!

Last edited by Perderabo; 03-25-2011 at 04:23 PM.
 
