The UNIX and Linux Forums  

  #1  
Old 04-26-2006
Registered User
 

Join Date: Apr 2006
Posts: 37
AIX performance

Hiya all,

I am a new sysadmin on AIX; I have worked on HP-UX for 3 years.

I have started a new role in an IBM house, and because there is only me and one other person, there are a couple of issues I cannot work out:

We have had a production server slowing down while processing batch jobs over the past few nights. I have checked many things, such as nmon stats, vmstat, top processes and the general performance of the machine.

We are seeing high wait times in vmstat throughout the evening, and we know of certain business-critical jobs that run during that time. These are running up to 4 hours longer than usual.

Can you tell me the best way to monitor jobs, processes etc. so I can tell the "BOSS" what is causing the issues?

The main problem is that developers run queries against the databases on the server; we are currently going through a process to stop this.

Any info will be helpful.

Thanks in advance
  #2  
Old 04-26-2006
Registered User
 

Join Date: Apr 2006
Posts: 5
Use filemon. You can then see whether there is a bottleneck in your disks or somewhere else in the I/O path. IMO it's probably the best tool there is for checking that, but you have to run it while the slowdown is occurring.
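
For anyone who has not used it, here is a minimal sketch of a timed filemon run. It assumes root access and that it is started while the slowdown is happening. The helper name run_filemon, the report path and the 60-second default are placeholders, not anything from this thread. filemon keeps tracing until trcstop is issued, and that is when the report gets written.

```shell
# Sketch: trace logical- and physical-volume I/O for a fixed window, then stop.
# run_filemon is a hypothetical helper name. Set DRY_RUN=1 to print the
# commands instead of running them (e.g. on a box without filemon).
run_filemon() {
    out=${1:-/tmp/filemon.out}   # report file (assumed path)
    secs=${2:-60}                # length of the sampling window in seconds
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "filemon -o $out -O lv,pv"
        echo "sleep $secs"
        echo "trcstop"
        return 0
    fi
    filemon -o "$out" -O lv,pv   # start tracing LV and PV activity
    sleep "$secs"                # let the batch jobs run while tracing
    trcstop                      # stop the trace so the report is written
}
```

For example, run_filemon /tmp/filemon.batch 120 during the batch window, then read the report for the busiest logical and physical volumes.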
  #3  
Old 04-27-2006
Registered User
 

Join Date: Nov 2002
Location: Singapore
Posts: 128
My guess is the IO subsystem.
Did you check on the IO average service time and average wait time during peak hour?
  #4  
Old 04-27-2006
Registered User
 

Join Date: Mar 2006
Posts: 106
here...

It is difficult to help without details, but try this:
1. If you use SSA, check that the volume group is in good health and that you have no stale physical volumes, then run defragmentation.
2. Check the system settings for the maximum number of open files per process and the buffer limits per process.
3. See in top which processes take up most of the time, then use lsof to figure out what is taking it, and then watch in iostat or vmstat how the picture changes as you go through steps 1 and 2.

Last edited by amro1; 04-27-2006 at 08:15 AM.
  #5  
Old 04-27-2006
Registered User
 

Join Date: Apr 2006
Posts: 37
Hi, thanks for the responses:

Yes, we have checked various subsystems during the issues. We have nmon graphs that show high wait times, and we also have alerting that showed wait times above 60 in the vmstat output.
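
A check like that alerting can be sketched as a small filter over vmstat output. This is only an illustration, not the alerting actually in use here: high_wait is a made-up helper name, and the threshold of 60 is taken from the figure quoted above. The wa column is located from the header line instead of being hard-coded, since the number of columns differs between systems.

```shell
# Sketch: print only the vmstat samples whose wa (I/O wait) value is above a limit.
# high_wait is a hypothetical helper; the limit defaults to 60.
high_wait() {
    awk -v limit="${1:-60}" '
        # Header line: remember which field is labelled "wa", then skip the line.
        { for (i = 1; i <= NF; i++) if ($i == "wa") { col = i; next } }
        # Data line (first field numeric, header already seen): compare wa.
        col && $1 ~ /^[0-9]+$/ && $col + 0 > limit + 0 { print }
    '
}
```

It reads from standard input, so something like vmstat 5 12 | high_wait 60 would show only the samples taken while wait time was above 60.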

I ran svmon:

--> svmon -G -i 2

               size      inuse       free        pin    virtual
memory      3145689    3087096      58593     182176     858018
pg space    2785280     428652

               work       pers       clnt
pin          182158          0          0
in use       913993    2173103          0
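
To put those page counts in proportion, here is a rough sketch that turns one svmon -G sample into percentages. svmon_summary is a made-up helper name, and it assumes the row labels shown above (memory, pg space, in use). It treats pers plus clnt as file pages, which is a simplification.

```shell
# Sketch: summarise an "svmon -G" sample as percentages.
# svmon_summary is a hypothetical helper; it reads svmon output on stdin.
svmon_summary() {
    awk '
        $1 == "memory"              { size = $2; inuse = $3 }      # real memory pages
        $1 == "pg" && $2 == "space" { pgsize = $3; pginuse = $4 }  # paging space pages
        # "in use" row: work, pers, clnt. Print one summary line per sample.
        $1 == "in" && $2 == "use" && size > 0 && pgsize > 0 {
            printf "mem_used=%.1f%% file_pages=%.1f%% pgsp_used=%.1f%%\n",
                   100 * inuse / size, 100 * ($4 + $5) / size, 100 * pginuse / pgsize
        }
    '
}
```

Fed the sample above, it prints mem_used=98.1% file_pages=69.1% pgsp_used=15.4%, i.e. memory is almost fully in use and roughly two thirds of it is holding file pages.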


I also ran topas and noticed lots of page faults due to paging in and out.

hdisk0 and hdisk1 are heavily utilised pretty much all day, as are other system disks, but I tend not to believe everything in topas.

We run ps awux > /tmp/monitoring.date.

This file is updated every 15 minutes, and in it I find the following, which apparently are normal system processes (these are the top procs in the file every 15 mins):

root 2064 8.5 0.0 12 9008 - A 24 Feb 60881:40 kproc
root 1806 8.5 0.0 12 9008 - A 24 Feb 60821:30 kproc
root 1548 8.5 0.0 12 9008 - A 24 Feb 60818:49 kproc
root 1290 8.5 0.0 12 9008 - A 24 Feb 60703:20 kproc
root 2322 8.5 0.0 12 9008 - A 24 Feb 60685:25 kproc
root 1032 8.5 0.0 12 9008 - A 24 Feb 60554:57 kproc
root 774 8.4 0.0 12 9008 - A 24 Feb 60152:51 kproc
root 516 8.1 0.0 12 9008 - A 24 Feb 57866:27 kproc
root 3096 0.0 0.0 64 9052 - A 24 Feb 198:45 kproc
root 2580 0.0 0.0 12 9004 - A 24 Feb 139:47 kproc
root 2838 0.0 0.0 16 9012 - A 24 Feb 1:41 kproc
root 3354 0.0 0.0 16 9012 - A 24 Feb 1:10 kproc
root 32510 0.0 0.0 16 9012 - A 24 Feb 0:03 kproc
root 30446 0.0 0.0 16 9012 - A 24 Feb 0:02 kproc
root 582168 0.0 0.0 16 9004 - A 28 Feb 0:00 kproc
root 25284 0.0 0.0 16 9004 - A 24 Feb 0:00 kproc
root 25542 0.0 0.0 16 9004 - A 24 Feb 0:00 kproc
root 25800 0.0 0.0 16 9004 - A 24 Feb 0:00 kproc
root 25026 0.0 0.0 16 9004 - A 24 Feb 0:00 kproc
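
Pulling the top entries out of a snapshot like this can be sketched as below. top_cpu is a made-up helper name, and it assumes the ps aux layout shown above, where the third field is %CPU. Lines whose third field is not a number (column headings, the "Monitor on" banner) are dropped.

```shell
# Sketch: list the N highest-%CPU lines from a "ps aux" style snapshot.
# top_cpu is a hypothetical helper; N defaults to 10.
top_cpu() {
    awk '$3 ~ /^[0-9]+(\.[0-9]+)?$/' |   # keep only lines with a numeric %CPU field
        LC_ALL=C sort -k3,3nr |          # highest %CPU first
        head -n "${1:-10}"
}
```

For example, top_cpu 5 < /tmp/monitoring.date, with the file name being whatever the 15-minute job actually writes.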


When I grep for defunct:

retail ps auwx Monitor on Mon 24 Apr 18:15:00 2006
rt07mszw 1228822 Z 0:00 <defunct>
rt05hdzw 925824 Z 0:00 <defunct>
rt0v9rzm 1064108 Z 0:00 <defunct>
rt0a5jzm 1990444 Z 0:00 <defunct>
rt07mszw 1772756 Z 0:00 <defunct>
rt07mszw 1733018 Z 0:00 <defunct>
rt06ggxp 1731806 Z 0:00 <defunct>
informix 246550 Z 0:00 <defunct>
rt07mszw 781804 Z 0:00 <defunct>
rt08cazm 807862 Z 0:00 <defunct>
informix 732496 Z 0:00 <defunct>
informix 671516 Z 0:00 <defunct>
retail ps auwx Monitor on Mon 24 Apr 18:30:01 2006
rt050azb 1280306 Z 0:00 <defunct>
informix 1502640 Z 0:00 <defunct>
rt0d5rws 1481808 Z 0:00 <defunct>
rt0j2czb 1410630 Z 0:00 <defunct>
rt0o5ayb 1410304 Z 0:00 <defunct>
rt0r5mza 1030858 Z 0:00 <defunct>
rt0o5ayb 1014478 Z 0:00 <defunct>
root 1914084 Z 0:00 <defunct>
root 1966324 Z 0:00 <defunct>
rt095req 1948512 Z 0:00 <defunct>
rt01mszm 1944508 Z 0:00 <defunct>
rt0d5rws 1682574 Z 0:00 <defunct>
root 455384 Z 0:00 <defunct>
informix 232872 Z 0:00 <defunct>
informix 732496 Z 0:00 <defunct>
informix 734412 Z 0:00 <defunct>
rt05adyk 551914 Z 0:00 <defunct>
rt0a2gzt 654196 Z 0:00 <defunct>


These do disappear and reappear with different PIDs.
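
One way to check that is to look for defunct PIDs that show up in more than one snapshot. The sketch below assumes the grep output format shown above (user in the first field, PID in the second) and that each snapshot starts with a "Monitor on" banner line; persistent_defunct is a made-up helper name. In the two snapshots above, PID 732496 (informix) is the only one listed in both.

```shell
# Sketch: report defunct PIDs seen in more than one snapshot.
# persistent_defunct is a hypothetical helper; output is "PID USER SNAPSHOTS".
persistent_defunct() {
    awk '
        /Monitor on/ { snap++; next }            # banner marks the start of a snapshot
        /<defunct>/ {
            if (!((snap, $2) in seen)) {         # count each PID once per snapshot
                seen[snap, $2] = 1
                count[$2]++
                user[$2] = $1
            }
        }
        END {
            for (pid in count)
                if (count[pid] > 1) print pid, user[pid], count[pid]
        }
    '
}
```

A zombie that is still there 15 minutes later usually means its parent has not collected it yet, so the parent of any PID this prints is the process to look at.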

Any ideas?

Thanks
  #6  
Old 04-27-2006
Registered User
 

Join Date: Apr 2006
Posts: 5
If you are getting a lot of paging during this time, check that your paging space is set up correctly as well.

A few questions you might bring up:
When was the last time the box was rebooted? If you have any memory leaks, a reboot will clean them up.

Has the number of apps on the box increased since it was bought? Does it need an actual memory upgrade?

Check the performance and tuning guide against what the vendor recommends.

I still recommend running filemon to see if you have a disk bottleneck. Your paging can increase if there is a bottleneck and writes are taking longer and longer to complete. If so, you would need to move your LVs around in order to improve performance.
  #7  
Old 04-27-2006
Registered User
 

Join Date: Apr 2006
Posts: 37
Can anyone give me some info on this output from top:

PID USER PRI NICE SIZE RES PFLTS STAT USER/SYSTIME CPU% COMMAND
0 root 0 -20 12k 8920k 0.0 non 0:00/ 6:43:22 99.7/ 0.4


It appears periodically in top and sometimes displays more than one process.

Thanks
All times are GMT -7. The time now is 11:18 AM.

