I need your help analyzing the CPU usage of our main server. I have shared below the CPU usage during busy hours and non-busy hours.
CPU usage is always at 100% during busy hours, and users constantly complain about slowness. This server is an LPAR configured in uncapped mode.
Partitions configured as uncapped can obtain temporary access to additional processor resources without changing the entitled capacity of
this or any other partition in the system.
The CPU is always busy during business hours. This suggests the LPAR is failing to obtain temporary access to additional processor resources, because no additional processor resources are available.
This means we have to add new CPUs.
What can be done from the OS end?
Please share your expert thoughts.
CPU usage during non-busy hrs
CPU usage during business hours
What is %wio here? Is it CPU time spent waiting for I/O?
From the sar man page, on %wio:
Quote:
Reports the percentage of time the processor(s) were idle during which the system had outstanding disk/NFS I/O request(s).
But I don't think so, because when calculating CPU idle time in sar or vmstat, the values of user, sys and wio are added together to get the total CPU usage.
So %wio would be CPU time spent waiting for I/O.
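To make the two readings of %wio concrete, here is a minimal Python sketch. The percentages are made up for illustration (they are not from this server) and the function name is hypothetical. It assumes the four sar columns %usr, %sys, %wio and %idle add up to 100. Under the man page definition quoted above, %wio is idle time with I/O outstanding, so only %usr + %sys is time the processor was executing; adding %wio on top gives 100 - %idle instead.

```python
def cpu_readings(usr, sys_, wio, idle):
    """Return the two ways of totalling sar's CPU columns (all in percent).

    'executing'   : %usr + %sys, time the processor was running code.
    'usr_sys_wio' : %usr + %sys + %wio, i.e. everything except %idle.
    """
    return {
        "executing": usr + sys_,
        "usr_sys_wio": usr + sys_ + wio,
    }


if __name__ == "__main__":
    # Illustrative values only: 40% usr, 20% sys, 30% wio, 10% idle.
    print(cpu_readings(40, 20, 30, 10))
```

With a large %wio the two totals differ a lot, so it matters which one a monitoring threshold is based on.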
Do you mind posting vmstat -Iwt 2 30 and iostat -Dl as a starter, and some more information about your system?
What type of frame are you running on, and how many resources are on the frame? What is your LPAR doing when it's busy? Is this an application or a DB box? What type of storage are you using, which AIX version are you running, and so on.
3.6 entitled CPUs for 4 virtual CPUs (if it's p5 or p6) seems pointless to me. Even according to IBM you should give your box at least 5 virtual CPUs - the way you run your box you get literally no benefit from virtualization whatsoever. And such a high wait I/O may point to a memory or I/O issue.
Regards
zxmaus
Thanks for the response. I have attached the CPU details for the servers with the reported problem. Since it is now off-business hours, CPU usage is below the threshold. Below is the lparstat output.
What I see is that you have awful average response times on hdisk85-90 - your 50-60 ms for writes is nowhere near acceptable - and that your I/O is very unevenly distributed across disks. Maybe a simple volume group reorganization with maximum instead of minimum spreading across disks will bring you some performance improvement.
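As a rough illustration of that kind of check, here is a small Python sketch that flags disks whose average write service time is above a cutoff. Everything in it is an assumption for illustration: the sample values are invented (they are not the iostat numbers from this thread), the 20 ms cutoff is arbitrary, and the function name is hypothetical.

```python
# Hypothetical sample: average write service time per disk, in milliseconds.
# Illustrative values only, NOT taken from the iostat output in this thread.
SAMPLE_WRITE_MS = {
    "hdisk85": 55.0,
    "hdisk86": 58.0,
    "hdisk10": 4.0,
    "hdisk11": 6.5,
}


def slow_disks(avg_write_ms, threshold_ms=20.0):
    """Return names of disks above threshold_ms, slowest first.

    threshold_ms is an arbitrary cutoff chosen for this sketch.
    """
    over = [(ms, name) for name, ms in avg_write_ms.items() if ms > threshold_ms]
    return [name for ms, name in sorted(over, reverse=True)]


if __name__ == "__main__":
    print(slow_disks(SAMPLE_WRITE_MS))
```

Feeding it the per-disk write times from iostat -Dl would give a quick list of the disks worth looking at first.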
What I see as well is that you have lots and lots of blocked I/Os due to insufficient filesystem buffers (and your system needs lots of filesystem buffers, as you have really significant reads that want to be buffered). I would probably start by setting some general buffers. Post vmstat -v and vmstat -s outputs if you like.
As for your system load: from the data you attached to your last post, you seem to have way too many CPUs entitled. As CPUs are usually quite expensive, I would cut that down to maybe 1 CPU, monitor, and see how your system is doing. Unfortunately that data does not really match the data from your earlier post.
Next - your system is doing a lot of scanning and freeing when busy, to make sure that the free list contains enough free memory pages for the next I/O cycle. I/O needs to be cached, and the more I/O you have, the more memory you will need to buffer it properly - OR you change the behavior of the filesystems doing the I/O. Mount options like rbrw, noatime and similar can change the memory utilization significantly, and so does setting Oracle's filesystemio_options to setall instead of async. If this is AIX 5.3 then you might or might not need some adjustments to the async I/O settings, as the standard values are way too low.
Actually the server has 4 virtual CPUs, is entitled to use 3.6 and, being p5, that means 8 threads.
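The arithmetic behind those numbers can be sketched in a few lines of Python. This assumes SMT is enabled with 2 hardware threads per virtual processor, as on POWER5; the function names are hypothetical.

```python
def logical_cpus(virtual_cpus, smt_threads=2):
    """Logical CPUs seen by the OS: virtual processors times SMT threads.

    smt_threads=2 assumes POWER5-style SMT is switched on.
    """
    return virtual_cpus * smt_threads


def entitlement_per_virtual(entitled, virtual_cpus):
    """Entitled capacity backing each virtual processor, in physical CPUs."""
    return entitled / virtual_cpus


if __name__ == "__main__":
    print(logical_cpus(4))                  # 4 virtual CPUs with SMT-2
    print(entitlement_per_virtual(3.6, 4))  # entitlement spread over 4 virtual CPUs
```

With 3.6 entitled over 4 virtual CPUs, each virtual processor is already backed by roughly 0.9 of a physical CPU, which is why an uncapped LPAR in this shape has very little headroom to borrow.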
To answer your question thoroughly I would like to see some vmstat -Iwt 2 10 outputs from busy timeframes (vmstat is a whole lot better with those options than any other tool I know). From the length of your run queue I would say double the virtuals, which gives you more threads, but it's easier to say that when the data is taken from really busy systems - and address your I/O issues by proper system tuning. You seem to have totally insufficient filesystem buffering. Being on AIX 5.3 I would suggest
If you still see growing numbers, then you can go up to 2048 with numfsbufs - we usually do that.
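One way to check for "growing numbers" is to compare the blocked-I/O counters from two vmstat -v runs taken some time apart, since those counters are cumulative since boot. Below is a minimal Python sketch of that comparison. The two sample texts are invented for illustration, the parsing assumes each relevant line has the form "count label ending in 'blocked with no ...buf'", and the function names are hypothetical.

```python
import re

# Hypothetical excerpts of two `vmstat -v` runs; the numbers are made up.
SAMPLE_BEFORE = """\
        100 pending disk I/Os blocked with no pbuf
          0 paging space I/Os blocked with no psbuf
       2000 filesystem I/Os blocked with no fsbuf
"""
SAMPLE_AFTER = """\
        100 pending disk I/Os blocked with no pbuf
          0 paging space I/Os blocked with no psbuf
       2500 filesystem I/Os blocked with no fsbuf
"""

LINE_RE = re.compile(r"\s*(\d+)\s+(.*blocked with no \w+)\s*$")


def blocked_counters(vmstat_v_text):
    """Map each 'blocked with no ...' label to its counter value."""
    counters = {}
    for line in vmstat_v_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            counters[match.group(2)] = int(match.group(1))
    return counters


def growth(before_text, after_text):
    """Return only the counters that increased between the two snapshots."""
    before = blocked_counters(before_text)
    after = blocked_counters(after_text)
    return {k: after[k] - before[k] for k in after if k in before and after[k] > before[k]}


if __name__ == "__main__":
    print(growth(SAMPLE_BEFORE, SAMPLE_AFTER))
```

A counter that keeps increasing between snapshots after a tuning change suggests the corresponding buffer pool is still too small; a large but static value may just be history from before the change.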
Additionally, for a DB box you should set AIXTHREAD_SCOPE=S in /etc/environment.
The numbers in your vmstat outputs are huge - how long has your box been up?
And do you mind posting the output of lsattr -El aio0 and iostat -A?
And be warned: closing one bottleneck in many cases opens another one. It might turn out that your box needs more memory once the I/O problems are fixed.