After running the ps, uptime, or w command I get the reply "killed"


 
# 8  
Old 09-30-2009
I have the same problem on AIX 6.1.
The problem appears after two or three days of workload.
The ulimit settings of the affected users are the same as root's.
Any clues?
# 9  
Old 09-30-2009
Check the error log with errpt. You may have rootvg corruption or something really wacky going on.
# 10  
Old 10-02-2009
Try running a trace (with truss) on it and see if this gives any clues:

$ truss ps
# 11  
Old 10-02-2009
Sorry, but when the problem occurs, truss gets killed as well.
There are no errors in errpt and no sign of rootvg corruption.

This symptom appears after two or three days of work. A reboot solves the problem, and the system then works as normally as any other AIX system I have.

Can I reinstall AIX in upgrade mode when the installed version is newer (because of the fixes) than the version on the DVD?

IBM Lab says that they can see the kill signal, but they cannot identify why this happens.


Quote:
Originally Posted by garethr
Try running a trace (with truss) on it and see if this gives any clues:

$ truss ps
# 12  
Old 10-03-2009
Quote:
Originally Posted by sebaswatts
This symptom appears after two or three days of work. A reboot solves the problem, and the system then works as normally as any other AIX system I have.
This reminds me of a machine I once had that ran a memory hog. The application would slowly allocate all the memory in the machine, thereby filling up the swap.

When AIX has a swap utilization of more than ~96%, it cannot reorganize its swap any more and the system starts to react unpredictably. One sign that the machine is close to hanging is that commands get killed the way you describe. In my experience it was usually a matter of minutes before the final crash set in.

Since you say that a reboot remedies the problem temporarily, this seems to fit. Perhaps you could monitor memory and swap utilization and correlate them with the times the problem occurs? Just an idea.
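A minimal sketch of such monitoring, meant to be run from cron every few minutes so the log can later be compared with the times commands start getting killed. It assumes that `lsps -s` prints a header line followed by a single data line whose second column is the percentage used; the function names and the `WARN_PCT` threshold are hypothetical, not something from this thread.

```shell
#!/bin/sh
# Sketch: log paging-space utilization with a timestamp.
# Assumed `lsps -s` output format:
#   Total Paging Space   Percent Used
#         4096MB               1%

# Read `lsps -s` output on stdin and print the percentage as a bare number.
paging_pct() {
    awk 'NR == 2 { sub(/%/, "", $2); print $2 }'
}

# Print one timestamped sample and flag it when it reaches WARN_PCT
# (hypothetical threshold, default 90).
log_sample() {
    pct=$1
    warn=${WARN_PCT:-90}
    if [ "$pct" -ge "$warn" ]; then
        echo "$(date '+%Y-%m-%d %H:%M:%S') paging=${pct}% WARNING"
    else
        echo "$(date '+%Y-%m-%d %H:%M:%S') paging=${pct}%"
    fi
}

# Only take a live sample on AIX, where lsps is available.
if [ "$(uname -s)" = "AIX" ]; then
    log_sample "$(lsps -s | paging_pct)"
fi
```

Appending the output to a file from cron gives a simple history; if the values climb steadily towards the mid-90s before the "killed" messages begin, that would support the memory-hog theory.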

I hope this helps.

bakunin
# 13  
Old 10-04-2009
Bakunin,

If you were right, wouldn't this happen to all users, including root? At least on my boxes, all user processes are impacted when I have a rogue process eating up the memory and paging space.

My best guess would be the number of runnable processes, which is likely unlimited for the root user but certainly not for the others (there it is 2000 by default, and this value is set system-wide per user, no matter how many processes the user himself runs). On a very busy box it can add up very fast, since every Oracle query forks several processes. Changing the value for one of the impacted users as a test when the problem occurs would prove me right or wrong.
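One way to check this guess before the problem strikes is to count processes per user and compare the counts with the per-user limit. The sketch below assumes the limit is the `maxuproc` attribute of `sys0` and that `ps -e -o user=` prints one user name per process; the function names and the 80% warning level are hypothetical.

```shell
#!/bin/sh
# Sketch: report users whose process count is close to the per-user limit.

# Read one user name per line on stdin; print "user count", busiest first.
procs_per_user() {
    sort | uniq -c | sort -rn | awk '{ print $2, $1 }'
}

# Read "user count" lines on stdin; print those at or above PCT percent
# of LIMIT. Usage: near_limit LIMIT PCT
near_limit() {
    limit=$1
    pct=$2
    awk -v limit="$limit" -v pct="$pct" \
        '$2 * 100 >= limit * pct { print $1, $2 }'
}

# Only query the live system on AIX.
if [ "$(uname -s)" = "AIX" ]; then
    # Assumption: maxuproc on sys0 holds the per-user process limit.
    limit=$(lsattr -El sys0 -a maxuproc -F value)
    ps -e -o user= | procs_per_user | near_limit "$limit" 80
fi
```

If a user shows up here shortly before the failures, raising the limit for a test (on AIX this is normally done with `chdev -l sys0 -a maxuproc=<value>`) would show whether the limit is the cause.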

My thoughts would also go in the memory direction, but towards pinned memory rather than paging: how much memory is pinned on the systems? AIX can only pin about 85% in total, and the longer the kernel is up, the more memory it pins. If you already have a high amount of pinned memory right after the reboot, for example from pinning your databases into memory (which is, by the way, really bad practice and just not required on AIX unless you are using very large pages), you can sometimes see this kind of issue without having full paging areas.
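To answer the "how much is pinned" question, a rough sketch that derives the pinned share from `vmstat -v` output. It assumes the output contains one line ending in "memory pages" and one ending in "pinned pages", each starting with the page count; the function name is hypothetical.

```shell
#!/bin/sh
# Sketch: compute the percentage of real memory that is pinned.
# Assumed `vmstat -v` lines (page counts in the first column):
#   1048576 memory pages
#    262144 pinned pages

# Read `vmstat -v` output on stdin; print pinned memory as a percentage.
pinned_pct() {
    awk '
        /memory pages/ && !mem { mem = $1 }   # first match: total pages
        /pinned pages/          { pin = $1 }  # pages currently pinned
        END { if (mem > 0) printf "%.1f\n", pin * 100 / mem }
    '
}

# Only query the live system on AIX.
if [ "$(uname -s)" = "AIX" ]; then
    echo "pinned: $(vmstat -v | pinned_pct)%"
fi
```

Sampling this right after a reboot and again every few hours would show whether pinned memory creeps towards the limit before the commands start getting killed.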

Kind regards
zxmaus

Last edited by zxmaus; 10-04-2009 at 03:30 AM..