Post 302523080 by sparcguy on Tuesday 17th of May 2011 10:53:27 PM
Tell us about your most recent system incident

Maybe we can start a thread to keep a record of administration changes, made by yourself or other people, that later blew up into a huge incident affecting many users.

I'll start. Recently, due to security requirements, we decided to disallow ftp for all users on all our servers by listing them in /etc/ftpusers. We also wanted to avoid duplicated work: when people leave we have to delete their accounts, but we sometimes forget to update /etc/ftpusers, so we decided to have a script do this job for us.

This is the script I came up with, which we put on all our servers. Basically it grabs every user in /etc/passwd, writes them into /etc/ftpusers, and runs once a month from crontab.

Code:
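# Keep dated backups before touching anything (suffix is ddmmyy).
/usr/bin/cp -p /etc/passwd /etc/passwd.`date +%d%m%y`
if [[ -s /etc/ftpd/ftpusers ]]; then
        /usr/bin/cp -p /etc/ftpd/ftpusers /etc/ftpd/ftpusers.`date +%d%m%y`
        # Every account name in /etc/passwd goes into the ftp deny list.
        /usr/bin/cut -d: -f1 /etc/passwd > /etc/ftpd/ftpusers
else
        # No existing (or empty) deny list: create it, then fill it the same way.
        /usr/bin/touch /etc/ftpd/ftpusers
        /usr/bin/cut -d: -f1 /etc/passwd > /etc/ftpd/ftpusers
fi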

We made this very minor change on a Friday and by Monday had totally forgotten about it. On Monday morning the application folks complained about a strange problem: one of their more critical apps had trouble restarting. On this Solaris server max_nprocs was set to 400 in /etc/system, and ps -ef showed 399 processes. The OS couldn't fork any more processes and the server became very sluggish. We wrestled with the problem for hours, shut down the apps and the database, and finally the decision was made to do an emergency reboot in the afternoon.
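A check along these lines might have flagged the problem before the process table filled up. This is only a sketch: the function name, the default margin of 20, and the message wording are my own assumptions, and the limit has to be passed in by hand (400 in our case).

```shell
# Hypothetical helper: warn when the process count gets close to a limit.
# Usage: check_proc_headroom COUNT LIMIT [MARGIN]
# Returns 1 (and prints a warning) when COUNT >= LIMIT - MARGIN, else 0.
check_proc_headroom() {
    count=$1
    limit=$2
    margin=${3:-20}    # assumed default, tune to taste
    if [ "$count" -ge $((limit - margin)) ]; then
        echo "WARNING: $count of $limit process slots in use"
        return 1
    fi
    echo "OK: $count of $limit process slots in use"
}

# Example, e.g. from cron (subtract 1 for the ps header line):
#   check_proc_headroom $(( $(ps -e | wc -l) - 1 )) 400
```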

By evening the system had slowly built back up to around 400 processes and the problem resurfaced. We went through all the processes and realized that one other application showed consistent errors in its logs. We also saw that this application, which does some "migration" activity, had a large backlog of processes.

One of the good things about the Solaris operating system is that it has a command called truss. We manually ran the application's command under truss, and from the debug output we managed to see that it was trying to log on to the backend storage server via ftp, but the login failed with a 'login mismatch'. In the meantime the number of file transfer requests kept growing, and this had a knock-on effect on the other applications. Once we removed that user from /etc/ftpusers on the backend server we saw a substantial drop in the number of processes and things started to normalize.
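The fix above, keeping the transfer account out of the deny list, could be built into the monthly script instead of being done by hand. A minimal sketch, assuming a POSIX awk (nawk or /usr/xpg4/bin/awk on Solaris) and a hypothetical allow file with one account name per line. The function name and the allow-file path are made up for illustration and are not part of the original script.

```shell
# Hypothetical helper: print every account name from a passwd-format file
# except the names listed (one per line) in an allow file.
# Usage: build_ftp_denylist PASSWD_FILE ALLOW_FILE
build_ftp_denylist() {
    passwd_file=$1
    allow_file=$2
    awk -F: -v allow="$allow_file" '
        # Load the accounts that must keep ftp access.
        BEGIN { while ((getline name < allow) > 0) keep[name] = 1 }
        # Print the login name of everyone else.
        !($1 in keep) { print $1 }
    ' "$passwd_file"
}

# Example (the allow-file path is illustrative only):
#   build_ftp_denylist /etc/passwd /etc/ftpd/ftp.allow > /etc/ftpd/ftpusers
```

An empty or missing allow file simply produces the full list, which matches what the original script did.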

Mistake: we didn't run 'last | grep ftp' as a pre-check before deploying the script.
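That pre-check can be tightened a little so it prints just the account names instead of raw login records. A rough sketch, assuming ftp sessions show up in `last` output with a line/tty column that starts with "ftp" (which is what the grep relies on too). The function name is hypothetical.

```shell
# Hypothetical helper: read `last` output on stdin and print each distinct
# user that has at least one ftp session.
# Assumption: column 1 is the user, column 2 is the line/tty (e.g. ftp1234).
ftp_users_from_last() {
    awk '$2 ~ /^ftp/ { print $1 }' | sort -u
}

# Example:
#   last | ftp_users_from_last
```

Any account this prints is still using ftp and would break once it lands in the deny list.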

Last edited by pludi; 05-18-2011 at 08:44 AM..
 
