#1
[SOLVED] Wait process holding CPU
Hi all, I have this performance issue. Code:
[srvbd1]root]/]>ps vg | head -1 ; ps vg | grep -w wait
PID TTY STAT TIME PGIN SIZE RSS LIM TSIZ TRS %CPU %MEM COMMAND
8196 - A 4448:23 0 384 384 xx 0 0 12.8 0.0 wait
53274 - A 4179:28 0 384 384 xx 0 0 12.1 0.0 wait
57372 - A 4436:05 0 384 384 xx 0 0 12.8 0.0 wait
61470 - A 4173:05 0 384 384 xx 0 0 12.0 0.0 wait
[srvbd1]root]/]>ps -ef | grep 8196| grep -v grep
[srvbd1]root]/]>
There are 4 "wait" commands and together they occupy about 50% of the CPU, as shown by ps aux. Code:
[srvbd1]root]/]>ps aux | head -1; ps aux | sort -rn +2 | head -5
USER PID %CPU %MEM SZ RSS TTY STAT STIME TIME COMMAND
root 57372 12.8 0.0 384 384 - A Feb 20 4437:22 wait
root 8196 12.8 0.0 384 384 - A Feb 20 4449:41 wait
root 53274 12.1 0.0 384 384 - A Feb 20 4180:41 wait
root 61470 12.0 0.0 384 384 - A Feb 20 4174:17 wait
fin102 299090 0.2 0.0 1992 1976 - A 09:19:01 0:42 /u02/F10204/UBS/
[srvbd1]root]/]>
Please help me kill these wait processes, as they are not real processes. Help would be greatly appreciated. Server performance is very poor; even a login takes a very long time.
Last edited by bakunin; 02-26-2013 at 11:38 AM.
#2
I see no "performance issue", just "ps" output. To assess the performance situation of your system it would be necessary to see the output of: Code:
vmstat -v
vmstat -tw 1
svmon -G
iostat 5
no -a
and, depending on the configuration of your system ("lscfg"), probably some others. Anyway, killing the processes is easy. You see the column labeled PID in your output: Code:
kill -15 <pid>
then wait a few seconds and issue another "ps". If <pid> isn't gone: Code:
kill -9 <pid>
I still have serious doubts that this will help your situation at all, and I fear it might make your situation even worse, but there you go. My recommendation is not to do it, but you are free to do as you please. I hope this helps. bakunin
#3
Wait process holding CPU
Hi bakunin, thanks for your reply. Let me explain the issue I am facing right now. The server is completely empty, but any application I start, such as WAS or an enterprise application, is still very slow and takes hours to come up. Even a putty login takes a few minutes. So we analyzed the system and found that only these wait processes looked like a bottleneck. But I am not sure; since they are kernel processes, I am not able to kill them. Here I post the requested details; please review them and let me know if you can find any reason for the server's behaviour. Code:
[srvbd1]root]/]>proctree 8196
[srvbd1]root]/]> kill -15 8196
kill: 8196: 0403-003 The specified process does not exist.
[srvbd1]root]/]>ps -fk | grep wait
root 8196 0 0 Feb 20 - 4479:28 wait
root 53274 0 0 Feb 20 - 4208:33 wait
root 57372 0 0 Feb 20 - 4466:54 wait
root 61470 0 0 Feb 20 - 4201:55 wait
[srvbd1]root]/]>vmstat -v
2035712 memory pages
1957145 lruable pages
1052819 free pages
1 memory pools
384893 pinned pages
80.0 maxpin percentage
20.0 minperm percentage
80.0 maxperm percentage
13.3 numperm percentage
260427 file pages
0.0 compressed percentage
0 compressed pages
13.2 numclient percentage
80.0 maxclient percentage
260187 client pages
0 remote pageouts scheduled
0 pending disk I/Os blocked with no pbuf
0 paging space I/Os blocked with no psbuf
2228 filesystem I/Os blocked with no fsbuf
1019 client filesystem I/Os blocked with no fsbuf
0 external pager filesystem I/Os blocked with no fsbuf
0 Virtualized Partition Memory Page Faults
0.00 Time resolving virtualized partition memory page faults
[srvbd1]root]/]>vmstat -tw 1
System configuration: lcpu=4 mem=7952MB
kthr memory page faults cpu time
------- --------------------- ------------------------------------ ------------------ ----------- --------
r b avm fre re pi po fr sr cy in sy cs us sy id wa hr mi se
0 0 702600 1052811 0 0 0 0 0 0 2 6268 7339 0 1 99 0 11:52:31
0 0 702602 1052809 0 0 0 0 0 0 4 5902 7045 0 1 99 0 11:52:32
0 0 702602 1052809 0 0 0 0 0 0 5 5991 6883 0 1 99 0 11:52:33
0 0 702602 1052809 0 0 0 0 0 0 4 5913 6100 0 1 99 0 11:52:34
[srvbd1]root]/]>
[srvbd1]root]/]>
[srvbd1]root]/]>svmon -G
size inuse free pin virtual
memory 2035712 982932 1052780 384894 702631
pg space 2097152 2404
work pers clnt other
pin 314839 0 0 70055
in use 702631 240 280061
PageSize PoolSize inuse pgsp pin virtual
s 4 KB - 935236 2404 361214 654935
m 64 KB - 2981 0 1480 2981
[srvbd1]root]/]>iostat 5
System configuration: lcpu=4 drives=3 paths=2 vdisks=0
tty: tin tout avg-cpu: % user % sys % idle % iowait
0.0 11.6 0.3 0.7 98.9 0.2
Disks: % tm_act Kbps tps Kb_read Kb_wrtn
hdisk0 2.0 3.2 0.4 0 16
hdisk1 2.0 6.4 0.8 0 32
cd0 0.0 0.0 0.0 0 0
tty: tin tout avg-cpu: % user % sys % idle % iowait
0.0 77.6 0.3 1.5 97.9 0.3
Disks: % tm_act Kbps tps Kb_read Kb_wrtn
hdisk0 0.2 11.0 2.4 0 56
hdisk1 0.2 7.9 1.2 0 40
cd0 0.0 0.0 0.0 0 0
[srvbd1]root]/]>
[srvbd1]root]/]>no -a
arpqsize = 12
arpt_killc = 20
arptab_bsiz = 7
arptab_nb = 149
bcastping = 0
clean_partial_conns = 1
delayack = 0
delayackports = {}
dgd_packets_lost = 3
dgd_ping_time = 5
dgd_retry_time = 5
directed_broadcast = 0
extendednetstats = 0
fasttimo = 200
icmp6_errmsg_rate = 10
icmpaddressmask = 0
ie5_old_multicast_mapping = 0
ifsize = 256
inet_stack_size = 16
ip6_defttl = 64
ip6_prune = 1
ip6forwarding = 0
ip6srcrouteforward = 1
ip_ifdelete_notify = 0
ip_nfrag = 200
ipforwarding = 0
ipfragttl = 2
ipignoreredirects = 0
ipqmaxlen = 100
ipsendredirects = 1
ipsrcrouteforward = 1
ipsrcrouterecv = 0
ipsrcroutesend = 1
llsleep_timeout = 3
lo_perf = 1
lowthresh = 90
main_if6 = 0
main_site6 = 0
maxnip6q = 20
maxttl = 255
medthresh = 95
mpr_policy = 1
multi_homed = 1
nbc_limit = 1017856
nbc_max_cache = 131072
nbc_min_cache = 1
nbc_ofile_hashsz = 12841
nbc_pseg = 0
nbc_pseg_limit = 2035712
ndd_event_name = {all}
ndd_event_tracing = 0
ndp_mmaxtries = 3
ndp_umaxtries = 3
ndpqsize = 50
ndpt_down = 3
ndpt_keep = 120
ndpt_probe = 5
ndpt_reachable = 30
ndpt_retrans = 1
net_buf_size = {all}
net_buf_type = {all}
net_malloc_police = 0
nonlocsrcroute = 0
nstrpush = 8
passive_dgd = 0
pmtu_default_age = 10
pmtu_expire = 10
pmtu_rediscover_interval = 30
psebufcalls = 20
psecache = 1
pseintrstack = 24576
psetimers = 20
rfc1122addrchk = 0
rfc1323 = 1
rfc2414 = 1
route_expire = 1
routerevalidate = 0
rto_high = 64
rto_length = 13
rto_limit = 7
rto_low = 1
sack = 0
sb_max = 1048576
send_file_duration = 300
site6_index = 0
sockthresh = 85
sodebug = 0
sodebug_env = 0
somaxconn = 1024
strctlsz = 1024
strmsgsz = 0
strthresh = 85
strturncnt = 15
subnetsarelocal = 1
tcp_bad_port_limit = 0
tcp_ecn = 0
tcp_ephemeral_high = 65535
tcp_ephemeral_low = 32768
tcp_finwait2 = 1200
tcp_icmpsecure = 0
tcp_init_window = 0
tcp_inpcb_hashtab_siz = 24499
tcp_keepcnt = 8
tcp_keepidle = 14400
tcp_keepinit = 150
tcp_keepintvl = 150
tcp_limited_transmit = 1
tcp_low_rto = 0
tcp_maxburst = 0
tcp_mssdflt = 1460
tcp_nagle_limit = 65535
tcp_nagleoverride = 0
tcp_ndebug = 100
tcp_newreno = 1
tcp_nodelayack = 0
tcp_pmtu_discover = 1
tcp_recvspace = 16384
tcp_sendspace = 262144
tcp_tcpsecure = 0
tcp_timewait = 1
tcp_ttl = 60
tcprexmtthresh = 3
thewall = 4071424
timer_wheel_tick = 0
udp_bad_port_limit = 0
udp_ephemeral_high = 65535
udp_ephemeral_low = 32768
udp_inpcb_hashtab_siz = 24499
udp_pmtu_discover = 1
udp_recvspace = 42080
udp_sendspace = 9216
udp_ttl = 30
udpcksum = 1
use_isno = 1
use_sndbufpool = 1
[srvbd1]root]/]>lscfg
INSTALLED RESOURCE LIST
The following resources are installed on the machine.
+/- = Added or deleted from Resource List.
* = Diagnostic support not available.
Model Architecture: chrp
Model Implementation: Multiple Processor, PCI bus
+ sys0 System Object
+ sysplanar0 System Planar
* vio0 Virtual I/O Bus
* vsa0 U789F.001.AAA8080-P1-T3 LPAR Virtual Serial Adapter
* vty0 U789F.001.AAA8080-P1-T3-L0 Asynchronous Terminal
* pci2 U789F.001.AAA8080-P1 PCI Bus
* pci1 U789F.001.AAA8080-P1 PCI Bus
+ fcs0 U789F.001.AAA8080-P1-C13-C1-T1 FC Adapter
* fscsi0 U789F.001.AAA8080-P1-C13-C1-T1 FC SCSI I/O Controller Protocol Device
* fcnet0 U789F.001.AAA8080-P1-C13-C1-T1 Fibre Channel Network Protocol Device
+ fcs1 U789F.001.AAA8080-P1-C13-C1-T2 FC Adapter
* fscsi1 U789F.001.AAA8080-P1-C13-C1-T2 FC SCSI I/O Controller Protocol Device
* fcnet1 U789F.001.AAA8080-P1-C13-C1-T2 Fibre Channel Network Protocol Device
* pci0 U789F.001.AAA8080-P1 PCI Bus
* pci3 U789F.001.AAA8080-P1 PCI Bus
+ ent0 U789F.001.AAA8080-P1-T1 2-Port 10/100/1000 Base-TX PCI-X Adapter (14108902)
+ ent1 U789F.001.AAA8080-P1-T2 2-Port 10/100/1000 Base-TX PCI-X Adapter (14108902)
* pci4 U789F.001.AAA8080-P1 PCI Bus
+ usbhc0 U789F.001.AAA8080-P1 USB Host Controller (33103500)
+ usbhc1 U789F.001.AAA8080-P1 USB Host Controller (33103500)
* pci5 U789F.001.AAA8080-P1 PCI Bus
* ide0 U789F.001.AAA8080-P1-T10 ATA/IDE Controller Device
+ cd0 U789F.001.AAA8080-P1-D3 IDE DVD-RAM Drive
* pci6 U789F.001.AAA8080-P1 PCI Bus
+ sisscsia0 U789F.001.AAA8080-P1 PCI-X Dual Channel Ultra320 SCSI Adapter
+ scsi0 U789F.001.AAA8080-P1-T5 PCI-X Dual Channel Ultra320 SCSI Adapter bus
+ scsi1 U789F.001.AAA8080-P1-T9 PCI-X Dual Channel Ultra320 SCSI Adapter bus
+ hdisk0 U789F.001.AAA8080-P1-T9-L5-L0 16 Bit LVD SCSI Disk Drive (73400 MB)
+ hdisk1 U789F.001.AAA8080-P1-T9-L8-L0 16 Bit LVD SCSI Disk Drive (73400 MB)
+ ses0 U789F.001.AAA8080-P1-T9-L15-L0 SCSI Enclosure Services Device
+ L2cache0 L2 Cache
+ mem0 Memory
+ proc0 Processor
+ proc2 Processor
[srvbd1]root]/]>kill -9 8196
kill: 8196: 0403-003 The specified process does not exist.
[srvbd1]root]/]>
#4
These are kernel wait processes. They are absolutely normal and come with the OS, one per logical CPU. As one can see, you have 2 procs, and I assume you have SMT activated with 2 logical CPUs per virtual or physical CPU.
As bakunin said, you should really not kill them. They are definitely not your problem. They are just waiting for work and help calculate your idle percentage. Leave them alone! See: IBM, "CPU Utilization for the wait KPROC".
Either this box is very weak resource-wise, the application is badly programmed, or there is some other kind of performance problem, for example with name resolution. Start up the application and run something like vmstat -w 2 20 while it performs slowly, to get a first impression of your system. Also check the logs of your application, if it writes any.
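To make the "about 50%" figure from the first post concrete, here is a minimal sketch that adds up the %CPU column for the wait entries. It assumes the "ps aux" column layout shown earlier in the thread; the function name sum_wait_cpu is hypothetical, and the sample lines are copied from the first post rather than read from a live system.

```shell
#!/bin/sh
# sum_wait_cpu: read "ps aux"-style lines on stdin and print the total
# of the %CPU column (field 3) for entries whose command is "wait".
sum_wait_cpu() {
    awk '$NF == "wait" { total += $3 } END { printf "%.1f\n", total }'
}

# Sample lines copied from the first post; on a live system this would
# be: ps aux | sum_wait_cpu
sum_wait_cpu <<'EOF'
USER PID %CPU %MEM SZ RSS TTY STAT STIME TIME COMMAND
root 57372 12.8 0.0 384 384 - A Feb 20 4437:22 wait
root 8196 12.8 0.0 384 384 - A Feb 20 4449:41 wait
root 53274 12.1 0.0 384 384 - A Feb 20 4180:41 wait
root 61470 12.0 0.0 384 384 - A Feb 20 4174:17 wait
fin102 299090 0.2 0.0 1992 1976 - A 09:19:01 0:42 /u02/F10204/UBS/
EOF
```

For the sample above this prints 49.7. Since the wait kprocs only run when a logical CPU has nothing else to do, a large total here points to an idle machine, not a busy one.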
#5
You might want to tune your maxperm and minperm settings to more sensible values. What these values should be depends on the application, but 95% and 3% are good starting points. Right now you have: Code:
[srvbd1]root]/]>vmstat -v
[...]
20.0 minperm percentage
80.0 maxperm percentage
80.0 maxclient percentage
[...]
Code:
[srvbd1]root]/]>svmon -G
size inuse free pin virtual
memory 2035712 982932 1052780 384894 702631
pg space 2097152 2404
This display is in memory pages (4 KB each). 2 million pages ~ 8 GB. Of these 2 million pages about 700k have been used; the rest is simply doing nothing. If this is everything your system ever does you could reduce its memory to ~4 GB and everything would be fine. Code:
[srvbd1]root]/]>iostat 5
System configuration: lcpu=4 drives=3 paths=2 vdisks=0
tty: tin tout avg-cpu: % user % sys % idle % iowait
0.0 11.6 0.3 0.7 98.9 0.2
Disks: % tm_act Kbps tps Kb_read Kb_wrtn
hdisk0 2.0 3.2 0.4 0 16
hdisk1 2.0 6.4 0.8 0 32
cd0 0.0 0.0 0.0 0 0
tty: tin tout avg-cpu: % user % sys % idle % iowait
0.0 77.6 0.3 1.5 97.9 0.3
Disks: % tm_act Kbps tps Kb_read Kb_wrtn
hdisk0 0.2 11.0 2.4 0 56
hdisk1 0.2 7.9 1.2 0 40
cd0 0.0 0.0 0.0 0 0
These disks are doing absolutely nothing. The little residual activity is the system itself idling away. It is the computer equivalent of twiddling one's thumbs. Code:
[srvbd1]root]/]>no -a
Looks like everything is at defaults here. Once the system actually does something there might be a reason to optimize a bit, but for now just leave it alone. I wonder what you want with the many adapters: you have no disks (save for the two system disks) right now.
Summary: It seems that the system is being built right now and some of the hardware isn't even connected (like disks). The system is definitely not the problem when a "putty" session needs "several minutes" to connect. I'd look at the network (routers, firewalls, VLANs, etc.) and network-related services (DNS, NIS, maybe Kerberos or LDAP, etc.) to see if the culprit is there. My first guess would be the name server, then the other components I named. I hope this helps. bakunin
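The page arithmetic in the post above can be checked with a small sketch. It assumes the 4 KB page size stated in the post; the function name pages_to_mb is hypothetical, and the two page counts are the "size" and "virtual" values from the svmon -G output in the thread.

```shell
#!/bin/sh
# pages_to_mb: convert a count of 4 KB memory pages to whole megabytes
# (integer division, so any fraction of a megabyte is dropped).
pages_to_mb() {
    echo $(( $1 * 4 / 1024 ))
}

total_pages=2035712     # "size" column of svmon -G
virtual_pages=702631    # "virtual" column of svmon -G

echo "total:   $(pages_to_mb "$total_pages") MB"
echo "virtual: $(pages_to_mb "$virtual_pages") MB"
```

The first line comes out as 7952 MB, which matches the mem=7952MB header printed by vmstat -tw earlier in the thread; the second is roughly 2.7 GB, which is where the suggestion that ~4 GB would be enough comes from.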
#6
Wait process holding CPU
Thanks for your detailed analysis, bakunin & zaxxon.
As explained, yes, the system was doing nothing at that point in time; it was completely idle. I was either trying to log in to sqlplus from another session, which was taking 2 minutes, or doing some other very general thing like bringing up a small service. I am going to try all the suggestions given and will let you know. Thanks a ton for your help.
#7
Another way to test whether delays are caused by name server lookups is to edit /etc/netsvc.conf.
Add or edit a line so that it says hosts=local4. FYI, my normal setting is hosts=local4,bind4, as I am not using any IPv6.
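To compare lookup times before and after such a change, a rough timing helper can be sketched as follows. This is only an illustration: the function name elapsed_seconds is hypothetical, it assumes the local "date" command understands +%s, and the lookup command in the comment ("host" with the name srvbd1 taken from the prompt) is a stand-in for whatever applies on the system under test.

```shell
#!/bin/sh
# elapsed_seconds: run the given command, discard its output, and print
# how many whole seconds it took. Assumes "date +%s" is supported.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $(( end - start ))
}

# Hypothetical usage for a name lookup (command and host name are
# placeholders):
#   elapsed_seconds host srvbd1
# Self-contained demonstration with a command of known duration:
elapsed_seconds sleep 1
```

If a lookup that takes many seconds with the name server in the search order returns immediately with hosts=local4, name resolution is the likely cause of the slow logins.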