High CPU Usage, users affected

# 1  
Old 02-03-2014

Dear All,

One of our production servers is suffering from high CPU usage, and the application is slow now.
Please guide me on how to solve this.
The nmon report shows full CPU usage.

Here are some server details.
Code:
bash-3.2# lparstat -i
Node Name                                  : *********
Partition Name                             : OBIEE App Server 1
Partition Number                           : 7
Type                                       : Shared-SMT-4
Mode                                       : Capped
Entitled Capacity                          : 1.30
Partition Group-ID                         : 32775
Shared Pool ID                             : 0
Online Virtual CPUs                        : 2
Maximum Virtual CPUs                       : 3
Minimum Virtual CPUs                       : 1
Online Memory                              : 21504 MB
Maximum Memory                             : 30720 MB
Minimum Memory                             : 10240 MB
Variable Capacity Weight                   : 0
Minimum Capacity                           : 1.00
Maximum Capacity                           : 3.00
Capacity Increment                         : 0.01
Maximum Physical CPUs in system            : 16
Active Physical CPUs in system             : 16
Active CPUs in Pool                        : 16
Shared Physical CPUs in system             : 16
Maximum Capacity of Pool                   : 1600
Entitled Capacity of Pool                  : 1460
Unallocated Capacity                       : 0.00
Physical CPU Percentage                    : 65.00%
Unallocated Weight                         : 0
Memory Mode                                : Dedicated
Total I/O Memory Entitlement               : -
Variable Memory Capacity Weight            : -
Memory Pool ID                             : -
Physical Memory in the Pool                : -
Hypervisor Page Size                       : -
Unallocated Variable Memory Capacity Weight: -
Unallocated I/O Memory entitlement         : -
Memory Group ID of LPAR                    : -
Desired Virtual CPUs                       : 2
Desired Memory                             : 21504 MB
Desired Variable Capacity Weight           : 0
Desired Capacity                           : 1.30
Target Memory Expansion Factor             : -
Target Memory Expansion Size               : -
Power Saving Mode                          : Disabled


Please help me out.


Can I use DLPAR to add CPU temporarily? Is there any temporary fix?


Thanks,
Sharath

# 2  
Old 02-03-2014
To be honest, there isn't any useful info in there =).

Can you send the output of this:
Code:
ps aux | head -1; ps aux | sort -rn -k 3 | head -10

Take the PID at the top of that list and cross-reference it with this command:
Code:
ps -ef|grep <PID>

Have you checked out
Code:
topas

This will show you which application/process is using the CPU. The lparstat output on its own does not tell you what is causing the issue.

You may also want to send the vmstat output:

Code:
vmstat -Iwt 2

# 3  
Old 02-04-2014
Hi,
Thanks for your reply, techy. I am not experienced in monitoring, but I'll try to post the details you asked for when the problem occurs today.

The impact was felt when the number of users was high during business hours.
It is back to normal now. The OBIEE app is running on this server.
I checked the processes, disk I/O and network traffic at that time. I suspect that only the nqsserve (BIP owner) process is consuming a lot of CPU.
The first output below is the only one I captured yesterday:

Yesterday During Business Hours
----------------------------------
Code:
bash-3.2# sar -u -P ALL 5 2

AIX PRDBIAPP1 1 6 00F7B1B64C00    02/03/14

System configuration: lcpu=8 ent=1.30 mode=Capped

11:55:43 cpu    %usr    %sys    %wio   %idle   physc   %entc
11:55:48  0       79      16       3       2    0.21    16.2
          1       66       5       2      27    0.12     9.0
          2       45       3       0      52    0.08     5.9
          3       32       4       0      65    0.07     5.1
          4       87      10       2       1    0.37    28.2
          5       42       4       0      54    0.13     9.7
          6       35       3       0      62    0.11     8.6
          7       26       3       1      71    0.10     7.8
          U        -       -       1       9    0.12     9.4
          -       56       7       2      34    1.18    90.6
11:55:53  0       65      18      12       5    0.16    12.2
          1       86       3       2      10    0.18    13.8
          2        0       3       0      96    0.05     3.8
          3        0       4       0      96    0.05     3.8
          4       86      11       0       3    0.42    32.3
          5       40       5       0      54    0.14    11.0
          6        0       2       0      98    0.09     7.0
          7        0       2       0      98    0.09     7.0
          U        -       -       1       8    0.12     9.1
          -       52       7       3      38    1.18    90.9

Average   0       73      17       7       4    0.18    14.2
          1       78       4       2      16    0.15    11.4
          2       28       3       0      69    0.06     4.8
          3       18       4       0      78    0.06     4.5
          4       86      11       1       2    0.39    30.2
          5       41       5       0      54    0.13    10.4
          6       19       2       0      78    0.10     7.8
          7       14       2       0      84    0.10     7.4
          U        -       -       1       9    0.12     9.3
          -       54       7       2      36    1.18    90.7
bash-3.2#

Present - Business hour
-----------------------------
Code:
bash-3.2# sar -u -P ALL 5 2

AIX PRDBIAPP1 1 6 00F7B1B64C00    02/04/14

System configuration: lcpu=8 ent=1.30 mode=Capped

09:37:16 cpu    %usr    %sys    %wio   %idle   physc   %entc
09:37:21  0       70      13       8       9    0.19    14.9
          1       82       9       2       6    0.25    19.2
          2        3       3       1      93    0.06     4.7
          3        2       3       0      95    0.06     4.7
          4       83      10       4       3    0.27    20.9
          5       68       8       1      23    0.17    13.1
          6        1       3       0      96    0.06     4.8
          7        1       2       0      96    0.06     4.8
          U        -       -       1      12    0.17    12.8
          -       53       7       4      36    1.13    87.2
09:37:26  0       66      14      11       8    0.18    13.7
          1       85       7       2       6    0.24    18.7
          2       55       4       3      37    0.11     8.3
          3       55       5       0      41    0.11     8.3
          4       79       9       3      10    0.23    17.8
          5       77       7       1      15    0.21    16.1
          6       18       7       0      75    0.08     6.3
          7       29       3       0      68    0.08     6.5
          U        -       -       0       4    0.05     4.2
          -       64       7       3      26    1.25    95.8

Average   0       68      13      10       9    0.19    14.3
          1       83       8       2       6    0.25    18.9
          2       36       4       3      57    0.08     6.5
          3       36       4       0      60    0.08     6.5
          4       81      10       3       6    0.25    19.4
          5       73       8       1      19    0.19    14.6
          6       11       5       0      84    0.07     5.6
          7       17       3       0      80    0.07     5.7
          U        -       -       1       8    0.11     8.5
          -       58       7       3      31    1.19    91.5

The customer wants me to improve performance during those hours, and I am out of ideas.
The partition runs with mode=Capped. Is that the reason for the high CPU usage?

--Thanks.
# 4  
Old 02-04-2014
...deleted nonsense (mixed up virtual and logical CPUs)

Last edited by -=XrAy=-; 02-05-2014 at 04:23 AM.. Reason: delete
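Since virtual and logical CPUs are easy to mix up, here is a rough sketch of how the figures posted earlier in the thread fit together. It assumes, as the lparstat and sar output suggest, that logical CPUs are virtual CPUs times SMT threads and that "Physical CPU Percentage" is the entitlement divided by the online virtual CPUs. The function name lpar_cpu_summary is made up for illustration, and the entitlement is passed in hundredths of a CPU to keep the arithmetic in integers.

```shell
# lpar_cpu_summary VCPUS SMT_THREADS ENTITLEMENT_IN_HUNDREDTHS
# Hypothetical helper: derives the logical CPU count and the share of a
# physical CPU backing each virtual CPU.
lpar_cpu_summary() {
    vcpus=$1
    smt=$2
    ent=$3
    echo "lcpu=$((vcpus * smt)) per_vcpu_pct=$((ent / vcpus))"
}

# Values from the lparstat -i output: 2 online virtual CPUs, SMT-4, 1.30 entitled.
lpar_cpu_summary 2 4 130
```

This lines up with lcpu=8 in the sar header and with the 65.00% Physical CPU Percentage reported by lparstat.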
# 5  
Old 02-04-2014
It is close to its entitled capacity (up to 95%), but did not hit the 1.3 processing units.
There may also be room for tuning in the application. I found this, which you might want to look into:
https://blogs.oracle.com/pa/entry/test
It contains a link that I cannot access, since I no longer have an account there:
https://support.oracle.com/rs?type=doc&id=1333049.1

Check the document and see if your box is tuned as they advise. This document also exists for 10g.

Also, maybe set up nmon to monitor your AIX LPARs. It will be easier to check what happened when a customer says it was slow 10 minutes before calling and you would otherwise have no history.
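To put numbers on "close to entitled capacity", the %entc column in sar is roughly physc divided by the entitlement. A minimal sketch of that calculation, using the system-wide physc values from the sar output in this thread (the helper name entc_pct is an assumption for illustration):

```shell
# entc_pct PHYSC ENT: print physical CPU consumed as a percentage of entitlement.
entc_pct() {
    awk -v physc="$1" -v ent="$2" 'BEGIN { printf "%.1f\n", physc / ent * 100 }'
}

# System-wide physc samples taken from the two sar runs, against ent=1.30.
for physc in 1.18 1.13 1.25; do
    printf 'physc=%s -> %s%% of ent=1.30\n' "$physc" "$(entc_pct "$physc" 1.30)"
done
```

The results differ from sar's own %entc by a few tenths because sar prints physc rounded to two decimals, but they show the same picture: the LPAR sits at roughly 87 to 96 percent of its 1.3 entitlement during business hours.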
# 6  
Old 02-04-2014
As zaxxon said.

I've come across some Oracle servers where an I/O problem was causing CPU problems.

I would also highly suggest setting up nmon reports and monitoring them for a day or two, to really give you an idea of what your system is doing. The 1.3 seems odd, and uncapped is definitely best if allowed, but keep in mind that if there is some config problem going on, uncapping the server is going to be a pain for your other LPARs.

First I'd adjust the CPU to maybe 1.6; personally, I would go for a minimum of 2 on an Oracle server.

Ensure nmon is installed on your server and schedule this command from the crontab:

Code:
/usr/bin/nmon -M -^ -f -d -T -A -s 60 -c 1435 -m <path/to/logdir>
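For what it's worth, the -s 60 -c 1435 pair in that command means one snapshot every 60 seconds, 1435 times, so each recording stops a little short of 24 hours. A small sketch of that arithmetic follows. The function name nmon_window and the midnight schedule shown in the comment are assumptions of mine, not something stated in this thread, and as far as I know -m names the directory nmon changes into before writing its file.

```shell
# nmon_window INTERVAL_SECONDS SNAPSHOT_COUNT
# Hypothetical helper: how long a recording with "-s INTERVAL -c COUNT" runs.
nmon_window() {
    total=$(( $1 * $2 ))
    printf '%d seconds = %dh %dm\n' "$total" "$((total / 3600))" "$((total % 3600 / 60))"
}

nmon_window 60 1435

# A crontab entry needs the five schedule fields in front of the command.
# Assumed schedule (start at midnight every day), shown only as a comment:
# 0 0 * * * /usr/bin/nmon -M -^ -f -d -T -A -s 60 -c 1435 -m <output directory>
```

With a daily start, the gap left at the end of each run lets one recording finish before the next begins.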

I'd set this up and review it for I/O, CPU, Mem and ensure everything is working correctly first before uncapping.

P.S. I'm sure you're aware, but be sure not to post the nmon file, as it contains sensitive data.
# 7  
Old 02-05-2014
So, this is a partitioned server. It has an allocation of 1.3 CPUs. I'm assuming therefore that there are other partitions defined, and perhaps a little spare CPU on the chassis as a whole. If the partition is capped, then it will use up to 1.3 CPUs and no more. If it is uncapped and there is spare CPU then it will burst through the limit and you will see the value for entitled CPU on sar or vmstat exceeding 100%.

If you take the cap off the partition and other servers are busy, they will still be guaranteed their allocated CPU as a minimum. However, as already pointed out by zaxxon, you are not CPU bound (95% of entitled capacity).

Consider partitions:-
  1. 1.3 CPU shared
  2. 3.0 CPU shared
  3. 2.0 CPU dedicated
  4. 1.7 CPU shared
Server has 8 CPUs. Two are dedicated, so out of the reckoning. If the shared CPU partitions are all uncapped, then if the other two are idle, the busy one could get 6 CPUs. If all are busy, then they will be limited as shown. If partition 4 is idle and 1 & 2 are busy then they will compete for the spare CPU (after both have reached their entitled CPU limit) and you can weight them to show a preference.
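The bookkeeping in that example can be sketched as below. This only adds up the entitlements; it is not a model of how the hypervisor actually hands out spare cycles. The helper name pool_summary is made up, and all values are in hundredths of a CPU so that plain integer arithmetic is enough.

```shell
# pool_summary TOTAL DEDICATED ENTITLEMENT...
# Hypothetical helper: size of the shared pool, the sum of the shared
# entitlements, and whatever is left unallocated (all in hundredths of a CPU).
pool_summary() {
    total=$1
    dedicated=$2
    shift 2
    entitled=0
    for e in "$@"; do
        entitled=$((entitled + e))
    done
    pool=$((total - dedicated))
    echo "pool=$pool entitled=$entitled unallocated=$((pool - entitled))"
}

# 8 CPUs in the server, 2.0 dedicated, shared partitions of 1.3, 3.0 and 1.7.
pool_summary 800 200 130 300 170
```

In this example the three shared partitions account for the whole 6-CPU pool, so an uncapped partition can only go above its entitlement by borrowing cycles that another shared partition is not using at that moment.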

You may be better off just upping your CPU allocation a little and then re-activating the partition (not just rebooting it); otherwise your end users will get used to having the full spare CPU available and then complain when it's in use elsewhere on the chassis. Is there another partition you could squeeze down, but take the cap off because it is rarely busy?

We have ours set at 0.1 CPU, with production uncapped and test/dev capped.



I hope that this helps.

Robin
Liverpool/Blackburn
UK