The UNIX and Linux Forums  

  #1 (permalink)  
Old 12-23-2002
Registered User
 

Join Date: Aug 2001
Posts: 58
How to monitor 25 different Digital UNIX and Sun Solaris machines

Hi

I am a new junior system administrator. Our team currently has around 25 machines, a mix of Sun Solaris and Digital UNIX, and every morning we telnet into each system to check that:
1) all daemons are working fine
2) all CPUs are up
3) the memory is okay

But telnetting into 25 machines is a pain. I am planning to write a shell script that will automatically send me a mail in case there is a problem with any of the above. Any ideas on how to start? I just need a pointer or an idea, which I would then convert into a shell script. Or is there a better way to do it?

I tried swatch, but unfortunately there are some Perl modules which I am unable to compile because of gcc problems on Sun Solaris.

Does somebody have a precompiled version of swatch for Sun Solaris?

regards
Hrishy

Last edited by xiamin; 12-24-2002 at 05:49 AM.
  #2 (permalink)  
Old 12-24-2002
Perderabo
Unix Daemon
 

Join Date: Aug 2001
Location: Washington DC Area
Posts: 8,354
First of all, there are lots of products that do this: Patrol, OpenView, Big Brother, etc.

But here is an approach that I once used:

First, I wrote a shell script that would check for various things. It wrote to an output file called something like status.`hostname`. The script would touch this file to make sure that it existed. And if it found anything that it didn't like, it would add an error line to the file.
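A minimal sketch of what such a per-host status script could look like. This is not the original script: STATUS_DIR, REQUIRED_DAEMONS, report and daemon_running are hypothetical names, the daemon list is only an example, and `uname -n` stands in for `hostname`. It assumes a POSIX-style `ps` and `awk` (on Solaris, nawk or the /usr/xpg4/bin versions).

```shell
#!/bin/sh
# Sketch of a per-host status script. An empty status file means "all OK".
# STATUS_DIR and REQUIRED_DAEMONS are hypothetical names and defaults.
STATUS_DIR=${STATUS_DIR:-/tmp}
REQUIRED_DAEMONS=${REQUIRED_DAEMONS:-"cron inetd syslogd"}
STATUS_FILE="$STATUS_DIR/status.`uname -n`"

# Append one error line to the status file.
report() {
    echo "$*" >> "$STATUS_FILE"
}

# Succeed if a process whose command name is exactly $1 is running.
daemon_running() {
    ps -e -o comm= | awk -v name="$1" '
        { n = split($1, parts, "/"); if (parts[n] == name) found = 1 }
        END { exit !found }'
}

# Start from an empty file so the collector can tell that the script ran.
rm -f "$STATUS_FILE"
touch "$STATUS_FILE"

for d in $REQUIRED_DAEMONS; do
    daemon_running "$d" || report "daemon $d is not running"
done
```

Checks for CPUs and memory would call report in the same way; the commands for those differ between Solaris and Digital UNIX, so they are left out here.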

I ported the script to all of our systems and a cron job would run it once an hour on the hour.

At 15 minutes after the hour, one system would run another script. This was an automated ftp job that would retrieve (and delete from the original host) all of the status files.
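The schedule described above could be expressed with crontab entries along these lines. The script paths are placeholders, not the names actually used.

```
# Hypothetical crontab entries; the script paths are placeholders.
# On every monitored host: run the checks on the hour.
0 * * * * /usr/local/bin/check_status.sh
# On the collecting host only: fetch the status files at 15 past the hour.
15 * * * * /usr/local/bin/collect_status.sh
```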

It is hard to detect errors during an automated ftp job. But after it runs, it is easy to see if the script obtained a status.`hostname` file for each hostname. If not, that is an error.

Also if the status files had any lines, that was an error.
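The two rules above (a missing status file is an error, and any line in a status file is an error) might be sketched like this for the collecting host. The function name check_reports and its arguments are hypothetical, and the sketch assumes the files have already been fetched into one directory.

```shell
#!/bin/sh
# Sketch of the check run on the collecting host after the transfer.
# check_reports and its arguments are hypothetical names.
# Usage: check_reports DIR HOST...
# Prints one line per problem and returns non-zero if any were found.
check_reports() {
    dir=$1
    shift
    problems=0
    for host in "$@"; do
        file="$dir/status.$host"
        if [ ! -f "$file" ]; then
            # The transfer did not deliver a file for this host.
            echo "$host: no status file received"
            problems=1
        elif [ -s "$file" ]; then
            # Any line in the file is an error reported by that host.
            sed "s/^/$host: /" "$file"
            problems=1
        fi
    done
    return $problems
}
```

A caller could save the output and send it to a pager or mail gateway only when the function returns non-zero, for example `check_reports /var/tmp/collected host1 host2 > /tmp/problems || mailx -s "monitor alert" admin@example.com < /tmp/problems` (directory, host names and address are made up).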

The script would page me if it detected any errors.

This was a little clunky, but I had a monitoring system up and running in less than half a day.
  #3 (permalink)  
Old 12-24-2002
RTM
Hog Hunter
 
Join Date: Apr 2002
Location: On my motorcycle
Posts: 3,039
For constant monitoring and no cost, check out Big Brother. You can monitor processes, CPU, disk... fantastic product.

You would still need something to check some other things (you could implement them in BB, but it's up to you).

This was written back in 1995, with not too many changes since then. It gives you a quick snapshot of what has happened in the last 24 hours (it runs once every 24 hours). It could be improved, but one does not always have the time! Sorry it's in csh, but it's more for knowledge than use. All the servers (over 60) send the snapshot report to one server, where a cron job collects all the info; if it doesn't find a report from a server in its list, it reports that too.
This is the script that runs on each server. If nothing is wrong it sends a zero-byte file (which proves the network connection is working). If you want the other script, post back. This works on Solaris 2.6 and does not have to run as root. I also have one for HP.

#!/bin/csh -f
# Created 09/21/95 HOG A script file to gather info from all Unix Systems
# ========SET UP SYMBOLS===========================
set defdir="/tmp"
set node="`hostname`"
set today="`date '+%m%d%y'`"
set theday="`date +'%d'`"
# Left unquoted so a space-padded day ("Dec  4") is word-split and compares
# equal to "$lastboot" below, which is word-split the same way
set thedate=`date +'%b %e'`
set themonth="`date +'%m'`"
set theyear="`date +'%Y'`"
set tmpfile = "$defdir/SI$node.$today"
set y2kfile = "/opt/Y2K/sunscan.$node-$theyear.$themonth.$theday-*/README.$node"
set dailycopy = "oven:/usr/local/sysconfigs/daily"
set monthcopy = "oven:/usr/local/sysconfigs"
# Minimum free space per filesystem, in kilobytes
set fsmin = "5000"
# Remove reports from earlier runs and start with an empty report
/usr/bin/rm $defdir/SI$node.*
/usr/bin/touch $tmpfile
#
# ========RUN FOLLOWING COMMANDS ON ALL SYSTEMS====
# Report the boot time if the system was last booted today
set lastboot = `/usr/bin/who -b | awk '{print $4" "$5}'`
if ("$lastboot" == "$thedate") echo "`/usr/bin/who -b`" >> $tmpfile
# Check space on local filesystems
set filesys = `df -bl |grep dsk|grep -v vol|/usr/bin/awk '{print $1}'`
foreach fs ($filesys)
    set fs1 = `/usr/bin/df -kl $fs|grep dsk|/usr/bin/awk '{print $4}'`
    set fson = `/usr/bin/df -kl $fs|grep dsk|/usr/bin/awk '{print $6}'`
    if ($fs1 < $fsmin) then
        echo "$fson is at $fs1 kilobytes" >> $tmpfile
    endif
end
# Check for OV status
if (-e /opt/OV/bin/ovstatus) then
    if ("$node" == "casc-nms128") then
        # do nothing - loaded but not running
    else
        set ovstat = `/opt/OV/bin/ovstatus |/usr/bin/grep -c RUNNING`
        if ($ovstat < 5) echo "Only $ovstat OV processes running. Please check." >> $tmpfile
    endif
endif
# Check on meta disks
if (-e /usr/opt/SUNWmd/sbin/metastat) then
    set mdstat = `/usr/opt/SUNWmd/sbin/metastat|/usr/bin/grep "State:"|awk '{print $2}'|/usr/bin/grep -cv "Okay"`
    if ($mdstat > 0) then
        /usr/bin/echo "$mdstat errors found in metastat" >> $tmpfile
    endif
endif
# Check on volume manager disks - normal user can't run vxdisk
if (-e /usr/sbin/vxprint) then
    set vxstat = `/usr/sbin/vxprint |grep -ic "recover"`
    if ($vxstat > 0) then
        /usr/bin/echo "$vxstat errors found in vxprint" >> $tmpfile
    endif
endif
# Check prtdiag for errors
if (-e /usr/platform/`uname -i`/sbin/prtdiag) then
    set prtdiag = "/usr/platform/`uname -i`/sbin/prtdiag"
    set prtdiagstat = `$prtdiag | grep -c "No failures found in System"`
    if ($prtdiagstat < 1) then
        /usr/bin/echo "Prtdiag shows system errors" >> $tmpfile
    endif
endif
#
# Send the daily report (empty if nothing was wrong) to the collecting server
/usr/bin/rcp $tmpfile $dailycopy
# On the first of the month, also send the Y2K scan README if present
if ("$theday" == "01" && "$node" != "oven") then
    if (-e /opt/Y2K) then
        /usr/bin/rcp $y2kfile $monthcopy
    endif
endif
if (-e /tmp/$node.all) then
    /usr/bin/rcp /tmp/$node.all $monthcopy
    /usr/bin/mv /tmp/$node.all /tmp/$node.old
endif
# ==================================================
exit
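For anyone who would rather work in Bourne shell than csh, the free-space check from the script above might be sketched roughly as follows. This is an illustration, not a port of the whole script: low_space is a hypothetical name, and it assumes `df -k` style input with the available kilobytes in column 4 and the mount point in column 6, plus an awk that accepts -v (nawk or /usr/xpg4/bin/awk on Solaris). It does not filter on "dsk" or "vol" as the csh version does, and a long device name that wraps the df output onto two lines would need extra handling.

```shell
#!/bin/sh
# Sketch: report filesystems with less than MIN_KB kilobytes free.
# Reads "df -k"-style lines on stdin (device, total, used, available,
# capacity, mount point). low_space is a hypothetical name.
# Usage: df -k | low_space MIN_KB
low_space() {
    awk -v min="$1" '
        # Skip the header line, then compare the "available" column.
        NR > 1 && $4 + 0 < min + 0 { print $6 " is at " $4 " kilobytes" }'
}

# Typical use, with the same 5000 KB threshold as fsmin in the csh script:
#   df -k | low_space 5000 >> "$tmpfile"
```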
  #4 (permalink)  
Old 12-25-2002
Registered User
 

Join Date: Aug 2001
Posts: 58
Hi

Thank you very, very much. I am also looking at Nagios and am awaiting your reviews on this one. And yeah, thank you for that cool, cool script, and merry Xmas to all these wonderful people here. ;-D

regards
Hrishy
  #5 (permalink)  
Old 12-27-2002
BSeanD
Registered User
 

Join Date: Aug 2002
Location: Melbourne, Australia
Posts: 127
Quote:
Thank you very very much....I am also looking at NAGIOS
I was going to suggest the same thing. We used Netsaint at my last job, written by the same person that is maintaining Nagios. It was very detailed and made looking after boxes around the country very easy.
The UNIX and Linux Forums Content Copyright ©1993-2008 The CEP Blog. All Rights Reserved.