Thank you all once again for all of your advice. It is very helpful. In response to shockneck, here is the requested information. This was taken during another busy period.
Please let me know what other information I should post that will help to diagnose this issue.
The values marked bold are the interesting numbers in the output: the "virtual" number is what you actually use, the "inuse" number is what you have. As a rule of thumb, everything is fine as long as the right number is smaller than the left. Once the right number is (much) bigger than the left one, you have to get more memory. The necessary amount is then the difference (again a rule of thumb, not an exact algorithm). Note that all these numbers are 4k pages; you will have to multiply by 4096 to get bytes.
In your case the machine has 4GB of RAM and uses roughly 1GB.
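To make the page arithmetic above concrete, here is a minimal sketch of the conversion and of the rule-of-thumb shortfall. The page counts are hypothetical, chosen only to match the "4GB of RAM, roughly 1GB used" case; they are not taken from the posted output, and the function names are mine.

```python
PAGE_SIZE = 4096  # bytes per page, as stated in the post above


def pages_to_bytes(pages: int) -> int:
    """Convert a count of 4k pages into bytes."""
    return pages * PAGE_SIZE


def pages_to_mib(pages: int) -> float:
    """Convert a count of 4k pages into MiB."""
    return pages_to_bytes(pages) / (1024 * 1024)


def shortfall_pages(used_pages: int, installed_pages: int) -> int:
    """Rule of thumb only: pages used beyond what is installed, or 0."""
    return max(0, used_pages - installed_pages)


# Hypothetical values: 4 GiB installed, 1 GiB in use.
installed_pages = 1048576
used_pages = 262144

print(pages_to_mib(installed_pages))                 # 4096.0
print(pages_to_mib(used_pages))                      # 1024.0
print(shortfall_pages(used_pages, installed_pages))  # 0
```

A result of 0 from shortfall_pages corresponds to the "everything is fine" case described above.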
Quote:
Originally Posted by jhall
If I remember correctly these are all defaults, but to me it does not look like you have a memory problem.
Could you please tell us what the machine is doing? What are the processes running on it, are there some standard applications on it?
[...]
Please let me know what other information I should post that will help to diagnose this issue.
Thanks!
This vmo -a output is either incomplete or indicates that you use some very old TL. If you did not do this yet: I suggest you upgrade your AIX to the current TL 10 SP5 or SP6. The VMM differs between some ML/TL considerably. (You might be below 5.2 ML4.)
Your application uses (together with kernel and shared memory) about 860MB RAM under load. This is pretty close to your minperm setting of 20% of your total RAM. The next tuning step would be to find out whether your application causes the trouble because it cannot allocate enough memory during operation as seen in your first post. To find out change the VMM settings.
This configuration change will enable your server to use more memory for the application by reducing memory available for file caching. If it works out as expected paging into the filesystem should stop eventually. The change will be active without reboot.
You might watch the config change's effect with vmstat. If you like the server's behaviour fix the settings by issuing the command again with a leading -p. If you don't the server will start with its default settings after the next reboot.
Last edited by shockneck; 11-10-2008 at 11:35 AM..
Reason: extended range for possible solution
What kind of application(s) is/are running on the box (maybe I missed it in an earlier post)?
As pointed out already, your box really has trouble with disk I/O. Do you have any chance to separate the OS from the application, i.e. with additional disks or SAN disks?
Here is the oslevel. I cannot upgrade the system yet, because I have no method to get a mksysb first. There is no tape drive or NIM server.
Here is the vmstat:
This system is running a banking application that is written in COBOL. The main process that takes up resources is "runcobol".
Right now, I don't have any option to reallocate the vg's and lv's. There are no external or SAN attached drives. The system uses only the internal disks and they are all used. I would like to recommend the purchase of a small SAN device, but I need data to back that recommendation up.
I inherited this system and am working to correct all of the issues.
Ok. So far the system looks as if it has never been tuned. Before we make any changes, it is best to back up your current values with the following commands:
I'd recommend you try the vmo settings shockneck posted to get rid of the paging space ins/outs. There is not much paging, but there is some, and that is bad because it slows your system down.
You can watch it with "vmstat 1" and hopefully there will be soon only zeros in the column for "pi" and "po".
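If you would rather not eyeball the columns, a small script can pull "pi" and "po" out of captured vmstat output. This is only a sketch: it assumes the usual vmstat layout with a header line that names the columns, and the sample text below is made up for illustration, not taken from this server.

```python
def paging_activity(vmstat_text: str):
    """Return a list of (pi, po) tuples, one per numeric sample line."""
    samples = []
    pi_idx = po_idx = None
    width = 0
    for line in vmstat_text.splitlines():
        fields = line.split()
        if "pi" in fields and "po" in fields:
            # Column-name header: remember where pi and po sit.
            pi_idx, po_idx = fields.index("pi"), fields.index("po")
            width = len(fields)
            continue
        is_sample = (
            pi_idx is not None
            and len(fields) == width
            and all(f.isdigit() for f in fields)
        )
        if is_sample:
            samples.append((int(fields[pi_idx]), int(fields[po_idx])))
    return samples


# Made-up sample in the assumed layout (hypothetical numbers).
SAMPLE = """\
kthr     memory             page              faults        cpu
----- ----------- ------------------------ ------------ -----------
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 1  2 215000 1200   0   3   5  40  120   0 310 2100 450 20 10 30 40
 1  1 215100 1180   0   0   0   0    0   0 290 1900 420 18  9 40 33
"""

samples = paging_activity(SAMPLE)
print(samples)                                   # [(3, 5), (0, 0)]
print(any(pi or po for pi, po in samples))       # True: paging was seen
```

The header lookup by column name is deliberate, so the script does not depend on fixed column positions.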
You have a lot of different blocked buffers which can be seen from your "vmstat -vs". Use the following to get rid of them:
After all these changes it is best to reboot. The VMM options become active immediately, but it would take some time until your memory is cleaned up. Also, the ioo settings only become active after remounting your filesystems, so a reboot is worth it.
You can monitor the ioo settings with "vmstat -v" after the reboot, as the counters for the blocked buffers will have been reset. If you constantly repeat the "vmstat -v" every 5 mins and you get no or very slow/small increases on the blocked buffer rows, i.e. those:
Btw, I guess most of your filesystems are still JFS? You can check it with "lsfs".
Also it would be good to know if your application supports asynchronous I/O, abbreviated AIO. If you don't know, we can check it out, but that would be a next step.
Is there also some database running like Oracle for example?
You have plenty of data to justify new disks of any kind! Just show them the %tm_act from the iostat you posted, which is constantly around 90% and often 100%. This is because the heavily used application and its data reside on the same disks as the OS, which is usually a no-go for a serious, professional server setup.
Let us know if it helped anything so far.
Edit:
If you get them far enough that they are generous enough to buy some sort of disks for their own server, maybe they will add a CD/DVD drive too. Otherwise you could export the update disc via NFS, for example from some Linux PC, to do the update. And you should update.
Here is the oslevel. I cannot upgrade the system yet, because I have no method to get a mksysb first. There is no tape drive or NIM server.
Ouch.... does that mean, that you don't have any backup of your server at all? Isn't there any other server that could export a filesystem to your server to write an mksysb onto?
Quote:
Originally Posted by jhall
Ouch again. Don't get me wrong - I am a big fan of the "never change a running system" idea but running 5.2 ML1 has disadvantages for reasons of security and (probably more interesting in your case) in virtual memory handling. With ML1 you don't have the option to let LRUD sort out how to divide the RAM between program cache and data cache.
Quote:
Originally Posted by jhall
Here is the vmstat:
Looks reasonable for your configuration. You could increase numfsbufs if you can afford to give more memory to it, but you would need to unmount and remount the FS for this change to take effect.
Quote:
Originally Posted by jhall
This system is running a banking application that is written in cobol. [...] I don't have any option to reallocate the vg's and lv's. There are no external or SAN attached drives. The system uses only the internal disks and they are all used. I would like to recommend the purchase of a small SAN device, but I need data to back that recommendation up.
I inherited this system and am working to correct all of the issues.[...]
Maybe I can give you some hints which arguments to bring forward.
Is the server's response time meeting the SLA? The guys in charge are often completely ignorant of technical details but react very quickly once a breach of contract is imminent or has taken place already. On the other hand, if your boss says "so what?" (preferably in an email), lean back and relax. Everything is fine then. (Don't forget to tell them that you cannot restore a backup easily.)
Technically: if this banking application is transaction based, how many transactions per second take place? From that number you might be able to calculate how many disks your app should use (rule of thumb: a 10K rpm disk handles about 125 I/Os per second, shown as tps in iostat). Organising disks in RAID can influence the number of possible IOPS. Use filemon to find out the application disks' response time. Writes should not take longer than 5ms, reads no longer than 10ms.
If this banking application is database based: Can old data be archived to keep the DB indexes small? Are data and log filesystems on separate disks?
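The disk-count rule of thumb above can be worked through as a short calculation. This is a rough sketch, not a sizing method: the transaction rate and the number of I/Os per transaction are hypothetical placeholders you would replace with measured values, and RAID effects are ignored.

```python
import math

IOPS_PER_DISK = 125  # rule of thumb from the post for a 10K rpm disk


def disks_needed(required_iops: float, iops_per_disk: float = IOPS_PER_DISK) -> int:
    """Smallest whole number of disks covering the required IOPS (at least 1)."""
    return max(1, math.ceil(required_iops / iops_per_disk))


# Hypothetical workload: 200 transactions/s, 3 disk I/Os per transaction.
transactions_per_second = 200
ios_per_transaction = 3
required_iops = transactions_per_second * ios_per_transaction

print(required_iops)                # 600
print(disks_needed(required_iops))  # 5
```

Rounding up matters here: 600 I/Os per second divided by 125 gives 4.8, and four disks would already be saturated.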