Can you please answer my question?
I see a lot of CPU utilization on AIX LPARs. I am able to find the cause of the problem, but I do not know how to mitigate or fix it.
For instance,
I found the process which is consuming most of the CPU and informed the responsible team.
How exactly does this need to be fixed?
Issue:
The java (WebSphere) JVM is consuming 96% CPU
and on the other server a database process is consuming 98% CPU.
You need to collect more data and answer some basic questions. Is this really a problem? Is the system slow? How many processes/threads are in the run queue? How long has the process been running, how many threads are running for the process (dbx), and what are their states? What does vmstat reveal? How long has the system been up?
Yes, this is a real problem. Many application users reported slowness. It is slow. There are 9 threads in the ABC database; only one is executing and the remaining threads are just waiting.
How long has the process been running?
I'm not sure about this. Please tell me how to check this.
I do not see any issues with the WebSphere servers now, but on the database server I do see utilization.
vmstat o/p
Last edited by Scott; 10-02-2013 at 03:53 PM..
Reason: Code tags for code blocks, not icode tags
First, have a look at the "us", "sy", "id" and "wa" columns of the "cpu" part: these are percentages, denoting the time the processors spent (on average) in the "user", "system", "idle" and "wait" parts of processing. "User" is roughly your programs, "system" is kernel activity and other system services, "idle" is when no process is running and "wait" is like idle, but with I/O operations outstanding. If you had high "wait" percentages it would hint at an I/O-bound system, but this isn't the case here. In fact your system is busy to saturation running your application, which is as it should be. If it is too slow, the only thing that helps is more processing power.
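To make the reading of those four columns concrete, here is a minimal Python sketch. The class name, the function name and both thresholds are my own assumptions for illustration, not AIX guidance, and the sample values are made up rather than taken from the output above.

```python
from dataclasses import dataclass


@dataclass
class CpuSample:
    """The cpu columns of one vmstat line, all in percent."""
    us: int  # user: roughly your programs
    sy: int  # system: kernel activity and system services
    id: int  # idle: nothing to run, no I/O outstanding
    wa: int  # wait: idle, but with I/O operations outstanding


def classify(sample: CpuSample, busy_threshold: int = 90, wait_threshold: int = 20) -> str:
    """Give a rough label to one sample.

    Both thresholds are assumed values chosen for this sketch.
    """
    if sample.wa >= wait_threshold:
        return "I/O-bound"
    if sample.us + sample.sy >= busy_threshold:
        return "CPU-saturated"
    return "has headroom"


# Hypothetical samples, not real measurements.
print(classify(CpuSample(us=85, sy=12, id=2, wa=1)))    # CPU-saturated
print(classify(CpuSample(us=20, sy=10, id=30, wa=40)))  # I/O-bound
```

A single sample says little; the label only becomes meaningful when it stays the same over many intervals.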
Alas, the system cannot get more processors right now. The last column, "ec", is the percentage of the "entitled capacity" consumed, and it is near 100(%) too. LPARs get some share of the system's processors by default, but can be entitled to some bigger amount should the necessity arise. These additional resources are dynamically added should the system get near saturation and are dynamically relinquished once the situation gets less demanding. This system has already allocated as much as it will ever get and this still isn't enough.
Now, let's look at the top line of the output: you have 10 logical CPUs. What a "logical CPU" comprises (some fraction of a physical CPU) depends on the physical CPU backing it and ultimately on the hardware you run on: POWER5? POWER6? POWER7? It might be that 10 lCPUs are a poor layout for your underlying hardware and overtax the physical CPUs with too many context switches.
Anyway, you definitely have to add CPUs to this LPAR: at the HMC, modify the LPAR profile to add more (physical) CPUs as "desired" and also increase the "maximum" processors to a new sensible value. To know what a "sensible value" for "maximum" is, you will probably have to monitor the system for a while, so go with a good estimate and change that after a few days. After you have changed the profile you will have to reboot (cold reboot/power cycle - a simple "shutdown -r" won't help) to have the new profile used.
I hope this helps.
bakunin
I understand that I need to increase the physical processors (Desired) from the HMC.
Yes and no: what you first need to do is to understand your system. This means (among other things) to understand the patterns of resource consumption there are. You might have a relatively stable demand for CPU or a widely varying one. You may have predictable ups and downs (for instance: day=high, night=low, etc.) or event-triggered ones. If your consumption is varying it might be by a small factor or a big one. All these things you can only find out through careful, long-term study of the system. I know these things even less than you, because i know even less about your system. So, please, bear with me for being somewhat general in my suggestions.
Set up and run sar (or nmon or whatever else you like) to monitor consumed resources (memory, CPU, I/O, net, ...) over some time to get a good impression of these usage patterns. The tool you use doesn't matter as long as it provides the data you are interested in.
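Once such data has been collected, finding a day/night pattern can be as simple as averaging per hour. This is only a sketch in Python: it assumes you have already exported (hour, idle percent) pairs from whatever tool you used, and the function name and the readings are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def busy_by_hour(samples):
    """Average CPU busy percentage (100 - idle) for each hour of the day.

    samples: iterable of (hour, idle_percent) pairs, an assumed export format.
    """
    buckets = defaultdict(list)
    for hour, idle in samples:
        buckets[hour].append(100 - idle)
    return {hour: mean(values) for hour, values in sorted(buckets.items())}


# Hypothetical readings: two at 02:00, three at 14:00.
samples = [(2, 95), (2, 91), (14, 4), (14, 8), (14, 6)]
print(busy_by_hour(samples))  # {2: 7, 14: 94}
```

With a few weeks of real data, the same table shows whether the load is predictable or event-triggered.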
Run a ps (or top or something alike) to learn about the most demanding processes in terms of memory and CPU. Maybe they run all day, maybe they run only during a certain time of the day. Maybe they run all day but only need very much memory/CPU power during a short time. Maybe ... You see, there are a lot of things not known about your system.
Performance tuning is a very simple task once you have understood where the bottleneck is. Finding out the bottleneck, though, can be extremely difficult. I suggest you read the little tutorial i wrote to get some pointers.
Quote:
Originally Posted by System Admin 77
But I see the CPU usage suddenly went down, today it is
I know that a particular JVM or DB process consumed a lot of CPU (by running topas),
but I am not sure how to tune it. (*Not sure why it went down)
How to tune Java processes or databases is beyond my area of expertise. I take them as they are and leave the tuning to the DBAs and application engineers.
However, we have now seen two situations of your system: one in which it choked under the load and one where it is (almost) idle. Again: what you need is to find out the pattern behind it.
In general there are three values to every resource you can define in the HMC profile: "minimum", "desired" and "maximum".
"Minimum" is the minimum amount the LPAR needs to allocate, otherwise it won't start.
"Desired" is how much the LPAR grabs if that much is available. This is the normal amount an LPAR has when it starts.
"Maximum" is how much the LPAR can additionally allocate should it be necessary. These additional resources (the difference between "desired" and "max") will be allocated only during runtime.
The reason why it is done this way is that you can "overcommit" the system's resources. If you have 100 GB of memory installed you can create LPAR profiles worth 150 GB in total. You leave some of them unstarted and/or the last one will only start with something between "minimum" and "desired" in this case.
What you have to do now is to find sensible values for "desired" and "maximum". This, again, can only be done by monitoring the system for some time.
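The minimum/desired/overcommit logic described above can be sketched in a few lines of Python. This is a simplified model under my own assumptions, not how the hypervisor actually does its bookkeeping: the class, the function names, the LPAR names and the per-profile numbers are all hypothetical. Only the 100 GB installed and 150 GB total figures come from the example above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MemoryProfile:
    name: str
    minimum_gb: int
    desired_gb: int
    maximum_gb: int


def overcommit_gb(profiles, installed_gb):
    """GB by which the sum of all 'desired' values exceeds installed memory."""
    return max(0, sum(p.desired_gb for p in profiles) - installed_gb)


def memory_at_start(profile, free_gb) -> Optional[int]:
    """Memory an LPAR gets at activation, or None if it cannot start."""
    if free_gb < profile.minimum_gb:
        return None  # less than "minimum" available: the LPAR won't start
    return min(profile.desired_gb, free_gb)  # "desired" if possible, else what is left


def start_all(profiles, installed_gb):
    """Activate the profiles in order and report what each one received."""
    free = installed_gb
    result = {}
    for p in profiles:
        got = memory_at_start(p, free)
        result[p.name] = got
        if got is not None:
            free -= got
    return result


# Hypothetical profiles worth 150 GB "desired" on a 100 GB machine.
profiles = [
    MemoryProfile("lpar1", 8, 45, 64),
    MemoryProfile("lpar2", 8, 45, 64),
    MemoryProfile("lpar3", 8, 60, 80),
]
print(overcommit_gb(profiles, 100))  # 50
print(start_all(profiles, 100))      # {'lpar1': 45, 'lpar2': 45, 'lpar3': 10}
```

The last LPAR starts with 10 GB, which is between its "minimum" and its "desired", exactly the situation described above.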
Quote:
Originally Posted by System Admin 77
How can we set/decide the number of Virtual CPUs in any LPAR. I mean on what basis ?
Basically, a "physical CPU" is what you know as a CPU: a processor you can touch. From one such physical CPU one or several "virtual CPUs" are created. The more virtual CPUs are created from one physical CPU the "smaller" the virtual CPUs become. You allocate a number of physical CPUs to an LPAR and state in the LPAR profile how many virtual CPUs to create from these. If you change the allocated number of processors (physical CPUs) this number of virtual CPUs will not change, they will just get more (or less) powerful.
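As a back-of-the-envelope illustration of "smaller" virtual CPUs, here is a Python sketch. It assumes the allocated capacity is split evenly across the virtual CPUs and that each virtual CPU shows up as a fixed number of logical CPUs (the threads-per-virtual-CPU parameter is my assumption). Real dispatching is more nuanced than this; the function names and values are illustrative only.

```python
def per_vcpu_capacity(physical_cpus: float, virtual_cpus: int) -> float:
    """Share of a physical CPU behind each virtual CPU, assuming an even split."""
    if virtual_cpus <= 0:
        raise ValueError("need at least one virtual CPU")
    return physical_cpus / virtual_cpus


def logical_cpus(virtual_cpus: int, threads_per_vcpu: int) -> int:
    """Logical CPUs the OS sees, assuming a fixed thread count per virtual CPU."""
    return virtual_cpus * threads_per_vcpu


# Same allocation, different layouts (illustrative values).
print(per_vcpu_capacity(1.0, 5))  # 0.2
print(per_vcpu_capacity(1.0, 2))  # 0.5
print(logical_cpus(5, 2))         # 10
```

Doubling the allocated physical CPUs in this model doubles what each virtual CPU is worth while their count stays the same.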
You need one CPU to run a thread (or - the same - a single-threaded process). Still, these threads may have different demands on processing power. Choose enough virtual CPUs to satisfy all threads and keep them as small as possible, yet as big as necessary - that is the basic idea. What exactly "necessary", "possible", etc., means: see above, monitor and find out.
About threads/processes: in the vmstat output you see "r" and "b" on the left side. If you regularly see big numbers in "r" the system might profit from a raised number of virtual CPUs, even if they are smaller than now. If there are only low numbers you might be able to reduce the number of lCPUs. Again: not enough data right now to suggest either.
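One way to put a number on "regularly see big numbers in r" is to count how often the run queue exceeds the number of logical CPUs. Comparing "r" against the lCPU count is a rule of thumb I am assuming here, not something vmstat itself reports, and the function name and samples in this Python sketch are made up.

```python
def run_queue_pressure(r_samples, logical_cpus):
    """Fraction of samples in which runnable threads outnumber logical CPUs."""
    if not r_samples:
        raise ValueError("no samples given")
    over = sum(1 for r in r_samples if r > logical_cpus)
    return over / len(r_samples)


# Hypothetical "r" column values from six vmstat intervals on a 10-lCPU LPAR.
print(run_queue_pressure([2, 3, 14, 16, 12, 1], 10))  # 0.5
```

A value that stays near 0 over days of monitoring would point towards fewer lCPUs, a value that stays high towards more; what counts as "high" is something only your own data can tell.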
As an afterthought: when you compare the first and second vmstat output you can notice that the numbers in the run-queue ("r") were low in the first but are high in the second. That basically means: there were few but "CPU-heavy" processes running when the first snapshot was taken, but many (very lightweight) processes ran during the second. It would be interesting to know which processes these were/are and if there are dependencies. If (for the last time: this is NOT a suggestion, but it might become one if the data back it up) only a few heavy processes run during times of heavy taxation, the machine might profit from fewer (but more potent) lCPUs.
Thanks much for your time and analysis. Currently we have 1 physical CPU and 24G memory.
Desired/ent physical CPU --> 1
Number Of Processors: 5 (5 virtual CPUs ==> 10 logical CPUs)
Again I saw heavy CPU utilization. So in my case I feel that decreasing the vCPUs is the better idea. (I will give it a try; please correct me if I am wrong.)
vmstat o/p
And
sar command output
Thank you, really appreciate your time and ideas.
Last edited by System Admin 77; 10-16-2013 at 11:38 AM..