Understanding & Monitoring CPU performance (Load vs SAR)


 
# 8  
Old 06-05-2016
Quote:
Originally Posted by jlliagre
Yes, Linux is well known to include uninterruptible I/O in its load average calculation.

q1) The so-called 1-minute load average will tend toward 1, but if the initial load was negligible, you'll need to wait several minutes for it to get close to 1. It will be about 0.6 instead of 1 after one minute. Conversely, if the initial load was higher than one, you'll need to wait long enough (likely more than 1 minute) for it to come down close to 1.

q2) The load average is derived from the run queue size, which is sampled at 10 ms intervals. The CPU load is computed from microstate accounting with "exact" precision (i.e. several orders of magnitude better, in the nanosecond range). A DTrace script should help figure out the cause of the discrepancy, but in any case the CPU utilization values are accurate, while the load average is a rough approximation.
Hi jlliagre,

Thanks for your reply.

For q1) Yep, when I run sar -q 1 60 (a 1-minute average), the average run queue size is a bit over 1, but my 1-minute load average (from uptime) shows only about 0.13.
Brendan Gregg's load average video seems to talk about exponential decay in the load calculation (but I am no maths expert).
So I will leave it at that: a load of 1 sustained for a minute will need more than 1 minute to be reflected in the "1 minute load average".
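
The exponential decay mentioned here can be sketched in a few lines of Python. This is only an illustrative model, assuming the 1-minute value is updated once per second with a damping factor of exp(-1/60); the function name damped_load and the update rule are assumptions for the example, not the actual Solaris kernel code.

```python
import math

def damped_load(samples, period=60.0, interval=1.0, start=0.0):
    """Exponentially damped average of run-queue samples (illustrative model)."""
    decay = math.exp(-interval / period)  # assumed damping factor per update
    load = start
    history = []
    for n in samples:
        # Each update keeps most of the old value and blends in the new sample.
        load = load * decay + n * (1.0 - decay)
        history.append(load)
    return history

# A constant load of 1 applied to a previously idle system, for 5 minutes.
history = damped_load([1] * 300)
after_1_min = history[59]
after_5_min = history[299]
print(f"after 1 min: {after_1_min:.2f}")  # 0.63
print(f"after 5 min: {after_5_min:.2f}")  # 0.99
```

Under these assumptions the "1 minute" figure only reaches about 0.63 after a full minute of constant load, which matches the "about 0.6" mentioned in the quoted reply.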

For q2) I am still confused about the difference between CPU load and CPU utilization.

(on a single-CPU computer, no multicore or hyperthreading)
If I have a continuous load of 1 for 1 minute, does that mean my CPU utilization is near 100% (0% idle) for that minute?

q3) You mentioned that the CPU load is sampled at 10 ms intervals.
What about the sampling interval for CPU utilization/time?

In a nutshell, if I have a 6-core CPU (12 threads in total) and an average load of 3 most of the time,
can I expect my CPU utilization to be around 3/12 * 100 = 25% (when the load is 3)?

P.S. One last question: does sar -q include the threads currently running on a CPU, or only those runnable/ready in the run queue?

Regards,
Noob

Last edited by javanoob; 06-05-2016 at 12:11 PM..
# 9  
Old 06-05-2016
Quote:
Originally Posted by javanoob
(on a single-CPU computer, no multicore or hyperthreading)
If I have a continuous load of 1 for 1 minute, does that mean my CPU utilization is near 100% (0% idle) for that minute?
It might mean that, but not necessarily. It could also be ten processes all repeating this pattern: fighting for the CPU for 1 second, then idling for 9 seconds.
Quote:
q3) You mentioned that the CPU load is sampled at 10 ms intervals.
What about the sampling interval for CPU utilization/time?
There is no sampling. The CPU utilization is accurately measured, not estimated.
Quote:
In a nutshell, if I have a 6-core CPU (12 threads in total) and an average load of 3 most of the time,
can I expect my CPU utilization to be around 3/12 * 100 = 25% (when the load is 3)?
That's only one possibility.
Quote:
P.S. One last question: does sar -q include the threads currently running on a CPU, or only those runnable/ready in the run queue?
The latter. A running thread is not waiting in a queue.
# 10  
Old 06-05-2016
Hi jlliagre,

Once again, thanks for your reply. I truly appreciate your time.

Quote:
It might mean that, but not necessarily. It could also be ten processes all repeating this pattern: fighting for the CPU for 1 second, then idling for 9 seconds.
q1) Do you mean the 10 processes are fighting for the CPU at the same time?
I would then expect to see a load of 9 (in queue) + 1 (running) and 100% CPU utilization for that 1 minute.

- Am I right?

Quote:
Quote:
In a nutshell, if I have a 6-core CPU (12 threads in total) and an average load of 3 most of the time,
can I expect my CPU utilization to be around 3/12 * 100 = 25% (when the load is 3)?
That's only one possibility.
q2) Can you elaborate on this further? Why is it 1?

q3) Should hyperthreading be taken into consideration when measuring load?
- 6 cores: point of saturation -> 6 (can take a load of up to 6), or
- 6 cores but 12 threads: point of saturation -> 12 (can take a load of up to 12)

Regards,
Noob
# 11  
Old 06-06-2016
Quote:
Originally Posted by javanoob
q1) Do you mean the 10 processes are fighting for the CPU at the same time?
I would then expect to see a load of 9 (in queue) + 1 (running) and 100% CPU utilization for that 1 minute.

- Am I right?
No. The load would be 9+1 during the 1 second when all processes compete, then 0 during the 9 seconds when they are all idle, so the average load would be 1. As you have only one core, the CPU utilization would be 10%.
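
The arithmetic behind this answer can be checked with a small simulation. It is a rough sketch with second-level granularity; the simulate function and its parameters are made up for the illustration, and it assumes the competing processes keep the single core fully busy during the contended second.

```python
def simulate(pattern_seconds=10, busy_seconds=1, processes=10, cores=1, duration=60):
    """Return (mean load, CPU utilization in %) for a repeating busy/idle pattern."""
    loads = []
    busy_core_seconds = 0
    for second in range(duration):
        active = (second % pattern_seconds) < busy_seconds
        n = processes if active else 0         # threads running or waiting to run
        loads.append(n)
        busy_core_seconds += min(n, cores)     # cores kept busy during this second
    mean_load = sum(loads) / duration
    utilization = 100.0 * busy_core_seconds / (cores * duration)
    return mean_load, utilization

# Ten processes competing for 1 second, then idle for 9 seconds.
burst_load, burst_util = simulate()
# One process running continuously.
steady_load, steady_util = simulate(busy_seconds=10, processes=1)

print(burst_load, burst_util)    # 1.0 10.0
print(steady_load, steady_util)  # 1.0 100.0
```

Both scenarios give the same mean load of 1 over the minute, yet the CPU utilization is 10% in one case and 100% in the other.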
Quote:
q2) Can you elaborate on this further ? Why is it 1 ?
See q1.
Quote:
q3) Should hyperthreading be taken into consideration when measuring load?
- 6 cores: point of saturation -> 6 (can take a load of up to 6), or
- 6 cores but 12 threads: point of saturation -> 12 (can take a load of up to 12)
It should, but the issue is that the saturation level varies with the kind of workload. See for example "CPU utilization of multi-threaded architectures explained" (Solaris and Systems Information for ISVs).
# 12  
Old 06-06-2016
Hi Jlliagre,

Thank you so much for your reply. Really appreciate your guidance.

Quote:
Originally Posted by jlliagre
No. The load would be 9+1 during the 1 second when all processes compete, then 0 during the 9 seconds when they are all idle, so the average load would be 1. As you have only one core, the CPU utilization would be 10%.
Please pardon my ignorance, but I still do not quite get the full picture or the maths behind this.

Quote:
the load will be 9 (in queue) + 1 running during the 1st second, and 0 during the next 9 seconds.
q1) Does that mean that the 10 threads/load are actually completed within the 1st second (right before the 2nd second)?

Quote:
so the average load would be 1.
q2) Do you mean that the load average is calculated as a per-second average over the past 10 seconds?
Hence (9+1) = 10 load in the past 10 seconds? (10 load / 10 sec)
So it's essentially 1 load per second for the rest of the 60 seconds (1 minute)?
-- But I thought you mentioned earlier that the load is sampled every 10 ms, not every 10 seconds?

Quote:
As you have only one core, the CPU utilization would be 10%
q3) My understanding is that I have 1 CPU/core.
It was 100% utilized during the 1st second of every 10 seconds.
In 1 minute, it would be utilized for 5 out of 60 seconds.
So the utilization for 1 minute is 5/60 * 100 = 8%.

How does it become 10%?
Because the CPU is fully utilized for 1 second in every 10 seconds = 1/10 * 100 = 10%?

Based on the above ->
Are both
a) the CPU utilization (1 sec / 10 sec * 100)
and
b) the CPU load ((9+1) load / 10 sec) calculated over every 10 seconds, then?

======

Please bear with me if I seem totally off.
Hope to hear your advice soon.

Regards,
Noob

Last edited by javanoob; 06-06-2016 at 12:34 PM..
# 13  
Old 06-06-2016
Quote:
Originally Posted by javanoob
q1) Does that mean that the 10 threads/load are actually completed within the 1st second (right before the 2nd second)?
Yes, that's what I wrote: ten processes all repeating this pattern: fighting for the CPU for 1 second, then idling for 9 seconds.
Quote:
q2) Do you mean that the load average is calculated as a per-second average over the past 10 seconds?
The load average is not really an average. The kernel maintains three load average values: the 1-minute, the 5-minute and the 15-minute one. They are updated at worst every second.
Quote:
Hence (9+1) = 10 load in the past 10 seconds? (10 load / 10 sec)
The load is 10 during one second and zero for the remaining 9 seconds. The kernel function that computes the load average smooths this to a load of 1.
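
To illustrate this smoothing, here is a short Python sketch that feeds the 10/0 pattern into an exponentially damped average. It assumes one update per second and a damping factor of exp(-1/60) for the 1-minute value; this is an illustrative model, not the kernel's actual implementation.

```python
import math

DECAY = math.exp(-1.0 / 60.0)  # assumed 1-minute damping, one update per second

load = 0.0
trace = []
for second in range(600):
    # Ten runnable processes during the first second of every 10-second cycle.
    runnable = 10 if second % 10 == 0 else 0
    load = load * DECAY + runnable * (1.0 - DECAY)
    trace.append(load)

# Look at the last full 10-second cycle, once the value has settled.
last_cycle = trace[-10:]
cycle_mean = sum(last_cycle) / len(last_cycle)
print(f"min {min(last_cycle):.2f}  max {max(last_cycle):.2f}  mean {cycle_mean:.2f}")
```

In this model the smoothed value never shows the spikes of 10. It stays within roughly 0.9 to 1.1 and averages 1 over each cycle.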
Quote:
So it's essentially 1 load per second for the rest of the 60 seconds (1 minute)?
One unit of load on average for every second of the minute, including the first one.
Quote:
-- But I thought you mentioned earlier that the load is sampled every 10 ms, not every 10 seconds?
The load average is updated every second from the run queue statistics, which are sampled every 10 ms. (Note that I might be wrong here; it is quite possible that, starting with Solaris 10, the run queue is also computed from microstate accounting instead of being sampled. That doesn't affect what we are talking about here.)
Quote:
q3) My understanding is that I have 1 CPU/core.
It was 100% utilized during the 1st second of every 10 seconds.
In 1 minute, it would be utilized for 5 out of 60 seconds.
So the utilization for 1 minute is 5/60 * 100 = 8%.

How does it become 10%?
There are six periods of 10 seconds in one minute, not five, hence 6/60*100 = 10%.

Last edited by jlliagre; 06-06-2016 at 05:15 PM..
# 14  
Old 06-07-2016
Hi Jlliagre,

So sorry for the late reply. Was having a long day today.
All aside, truly appreciate your time and explanation; having your explanation beats me googling and reading all around.

Back to the topic

Quote:
There are six periods of 10 seconds in one minute, not five, hence 6/60*100=10%
You are right. Seriously, I do not know how I ever came up with 5 periods of 10 seconds in 1 minute -_-!

So in summary, can I make the following assumptions ->
A load of 1 over 1 minute might mean:

a) a simple scenario with an actual load of 1 every second, where the CPU is 100% utilized for the whole minute

b) a load > 1 in certain seconds that averages out to 1 load/second over the minute; the ratio between load and CPU utilization will not be 1:1, as the CPU might be able to complete multiple threads/loads per second.

Thus
c) High load != High CPU
As shown in your example above, an average load of 1 over a minute (for 1 core/CPU) might come with only 10% CPU utilization.

High CPU = High Load
If CPU utilization is high, this means that CPU time is being used / held by load in the system.

Regards,
Noob