We frequently receive a lot of CPU-related alerts. All sorts of checks have been created to keep an eye on the CPU, but many of them are too noisy because the CPU is always reported as high. The system and application owners say there is no issue with the CPU.
So now I'm thinking of adding a new condition to the existing check. This condition will add up the CPU usage of all processes found in the process table. If the sum is less than 50%, the check will never alert.
Can someone please let me know if there's anything wrong with this thinking?
I expect the sum of all PCPU is related to the us column in
Code:
vmstat 2 2
(last line)
I say "related", because I don't know how AIX scales the PCPU (total? per CPU core? per logical processor?).
I also do not see how your command should sum up all PCPU if tail only shows the last 10 processes.
Then, for post-processing it is better to omit the header line
Code:
ps -e -o pcpu= -o pid= -o user= -o args=
And the floating point column is better numerically sorted with
Code:
sort -k1n
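A minimal sketch of the summing step, assuming the pcpu value is the first column (as in the ps call above). The function names sum_pcpu and pcpu_sum_exceeds are made up for illustration, and the 50 in the usage line is just the threshold from the original question. How AIX scales pcpu is still an open question here, so treat the sum as a rough indicator only.

```shell
# Sum the first column (pcpu) of ps-style output read from stdin.
sum_pcpu() {
  awk '{ sum += $1 } END { printf "%.1f\n", sum }'
}

# Hypothetical gate: exit status 0 only if the summed pcpu reaches the
# threshold given as $1, so a check can skip alerting below it.
pcpu_sum_exceeds() {
  awk -v t="$1" '{ sum += $1 } END { exit !(sum >= t) }'
}
```

It could be used as `ps -e -o pcpu= | sum_pcpu`, or as `ps -e -o pcpu= | pcpu_sum_exceeds 50 && echo "sum is at or above 50"`.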
C
(-f, l, and -l flags) CPU utilization of process or thread, incremented each time the system clock ticks and the process or thread is found to be running.
The value is decayed by the scheduler by dividing it by 2 once per second. For the sched_other policy, CPU utilization is used in determining process
scheduling priority. Large values indicate a CPU intensive process and result in lower process priority whereas small values indicate an
I/O intensive process and result in a more favorable priority.
man ps on AIX:
Code:
%CPU
(u and v flags) The percentage of time the process has used the CPU since the process started. The value is computed by dividing the
time the process uses the CPU by the elapsed time of the process. In a multi-processor environment, the value is further divided by the
number of available CPUs because several threads in the same process can run on different CPUs at the same time. (Because the
time base over which this data is computed varies, the sum of all %CPU fields can exceed 100%.)
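To make the %CPU definition above concrete, here is a small worked calculation. It is only my reading of the man page text (CPU time divided by elapsed time, divided again by the number of CPUs); the function name lifetime_pcpu and the sample numbers are invented.

```shell
# %CPU as described above: cpu_seconds / elapsed_seconds / ncpu * 100.
# Arguments: $1 = CPU seconds used, $2 = elapsed seconds, $3 = number of CPUs.
lifetime_pcpu() {
  awk -v cpu="$1" -v elapsed="$2" -v ncpu="$3" \
    'BEGIN { printf "%.1f\n", cpu / elapsed / ncpu * 100 }'
}

# A process that used 120 CPU seconds over 600 seconds on a 4-CPU host:
lifetime_pcpu 120 600 4   # prints 5.0
```

Because the value is averaged over the whole lifetime of the process, a process that sat idle for a long time and only recently became busy would still show a low figure.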
I'm not sure if the above means anything to the experienced AIX users here, but the second definition seems to suggest that the sum of the CPU usage of all processes can be used somehow.
OK, so it is scaled, but the man page simply says "CPUs".
Old man pages often say CPUs because the authors could not imagine that a CPU might have more than one core, not to mention hyper-threading...
A typical measurement comes from vmstat, where (100 - id) gives the used CPU%.
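A rough sketch of the (100 - id) calculation. Since the position of the id column differs between platforms, this looks the column up by its header name instead of hard-coding it. The function name cpu_used_from_vmstat is made up, and I am assuming the header and data columns line up field for field.

```shell
# Read vmstat output from stdin and print 100 - id for the last data line.
cpu_used_from_vmstat() {
  awk '
    # Remember which column is labelled "id" in the header line.
    { for (i = 1; i <= NF; i++) if ($i == "id") col = i }
    # On numeric data lines, keep the latest value of 100 - id.
    col && $col ~ /^[0-9]+$/ { used = 100 - $col; found = 1 }
    END { if (!found) exit 1; print used }
  '
}
```

It could be used as `vmstat 2 2 | cpu_used_from_vmstat`, which takes the last line (the first sample line is an average since boot).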
Another typical measurement is the system load, as given by the uptime command, which is often r + b + w from the vmstat command, integrated over 1 minute, 5 minutes, and 15 minutes.
What do you currently measure?
Load is not being monitored, which is what I'm leaning towards. I just don't know how to figure out what load number signifies that a host is being stressed.
Most of the AIX hosts I have here have multiple CPUs (at least 2). So does anyone have a proven calculation for determining whether a load number is too high for a particular host?
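This is not a proven calculation, only a common rule of thumb: divide the 1-minute load average by the number of logical CPUs and watch for values that stay well above 1. A sketch, assuming uptime prints a "load average:" field and that the CPU count is known and passed in; the function name load_per_cpu is made up.

```shell
# Read one line of uptime output from stdin and print the 1-minute
# load average divided by the CPU count given as $1.
load_per_cpu() {
  awk -v ncpu="$1" '
    match($0, /load averages?: */) {
      rest = substr($0, RSTART + RLENGTH)   # text after "load average:"
      split(rest, la, ",")                  # la[1] starts with the 1-minute value
      printf "%.2f\n", la[1] / ncpu
    }
  '
}
```

It could be used as `uptime | load_per_cpu 4` on a host with 4 logical CPUs. Any alert threshold on the result is a judgment call and would need tuning per host.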