False alerts


 
# 15  
Old 02-24-2017
Hi,

In that case, if Nagios is already monitoring the load of your server, what is it you're hoping to achieve by running your own separate load monitoring script? What does it do differently from the Nagios check, and would it not be possible to amend the Nagios check to do whatever you want, so that you only have one single check?
# 16  
Old 02-24-2017
Quote:
Originally Posted by drysdalk
Hi,

In that case, if Nagios is already monitoring the load of your server, what is it you're hoping to achieve by running your own separate load monitoring script? What does it do differently from the Nagios check, and would it not be possible to amend the Nagios check to do whatever you want, so that you only have one single check?

Hi

We need to get email alerts and monitor our app without any delay.
# 17  
Old 02-24-2017
Quote:
Originally Posted by anil529
Hi drysdalk

Quote:
I doubt a script this simple could cause any problems for Nagios. What monitoring is already configured in Nagios? Are these load alerts that you regard as false coming as e-mails from your script, or as alerts from Nagios? And can you please provide the crontab entry so it can be ruled out as a cause?

Yes, Nagios is already monitoring load.
0,15,30,45 * * * * /unixmon/servermon.p > /dev/null 2>&1
*/5 * * * * /etc/applicationMonitoring.sh

---------- Post updated at 03:04 PM ---------- Previous update was at 02:59 PM ----------

I have placed the rm command at the top. Will it make a difference?

You don't need a temp file. How about this for the trailing portion of your script:
Code:
CPU_LOAD=`sar -P ALL 10 1 |grep 'Average.*all' |awk -F" " '{print 100.0 -$NF}' |cut -d \. -f1`
if [ $CPU_LOAD -gt $THRESHOLD ]; then
    echo "CPU notification on $HOSTNAME is ${CPU_LOAD}% " `date`  | mail -s "Check CPU usage on $HOSTNAME `date`  " $ALERT
fi

Moderator's Comments:
Mod Comment edit by bakunin: fixed lost QUOTE-tag

Last edited by bakunin; 02-27-2017 at 06:37 AM..
# 18  
Old 02-24-2017
Quote:
Originally Posted by vgersh99
Quote:
Yes, Nagios is already monitoring load.
0,15,30,45 * * * * /unixmon/servermon.p > /dev/null 2>&1
*/5 * * * * /etc/applicationMonitoring.sh

---------- Post updated at 03:04 PM ---------- Previous update was at 02:59 PM ----------

I have placed the rm command at the top. Will it make a difference?

You don't need a temp file. How about this for the trailing portion of your script:
Code:
CPU_LOAD=`sar -P ALL 10 1 |grep 'Average.*all' |awk -F" " '{print 100.0 -$NF}' |cut -d \. -f1`
if [ $CPU_LOAD -gt $THRESHOLD ]; then
    echo "CPU notification on $HOSTNAME is ${CPU_LOAD}% " `date`  | mail -s "Check CPU usage on $HOSTNAME `date`  " $ALERT
fi


I can change and test.
Moderator's Comments:
Mod Comment edit by bakunin: fixed missing QUOTE-tag

Last edited by bakunin; 02-27-2017 at 06:40 AM..
# 19  
Old 02-24-2017
Hi,

(Edit: forgot to include the date in the mail)

Here's a version of your script that's as streamlined as I've been able to make it:

Code:
#!/bin/bash

hostname="`/bin/hostname`"
date="`/bin/date`"
load="`/usr/bin/sar -P ALL 10 1 | /usr/bin/awk '$1 == "Average:" && $2 == "all" {print 100-$NF}'`"

threshold="90.00"
recipient="unixforum@localhost"
subject="Load alert on host $hostname"
body="Load is $load, date is $date"

if [[ "$load" > "$threshold" ]]
then
        echo "$body" | /usr/bin/mail -s "$subject" "$recipient" >/dev/null 2>/dev/null
        exit 1
else
        exit 0
fi

Again in my own local tests this worked fine, but then so did your original. You may need to amend paths to things like sar, awk, etc (it's always a good idea to use fully-qualified paths in scripts that will be run via crontab).
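To illustrate the point about paths: cron normally starts jobs with a much smaller environment than a login shell, so a command that resolves interactively may not be found from crontab. A minimal sketch of one way to guard against that is below. The PATH value and the `require_cmd` helper are assumptions for illustration, not part of the script above; adjust the directories to wherever sar, awk and mail live on your system.

```shell
#!/bin/sh
# Set an explicit search path instead of relying on whatever cron provides.
# The directory list is an assumption; adjust it for your system.
PATH=/usr/local/bin:/usr/bin:/bin
export PATH

# require_cmd (hypothetical helper): print the resolved location of a command,
# or complain on stderr and return non-zero if it cannot be found.
require_cmd() {
    command -v "$1" || { echo "missing command: $1" >&2; return 1; }
}

# Example use near the top of a monitoring script:
#   SAR=$(require_cmd sar) || exit 2
#   AWK=$(require_cmd awk) || exit 2
```

Failing early with a clear message makes a "command not found" problem visible in cron's mail or log rather than producing an empty load value further down the script.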

Hope this helps.

Last edited by drysdalk; 02-24-2017 at 05:57 PM..
# 20  
Old 02-24-2017
Quote:
Originally Posted by vgersh99
Quote:
Code:
#!/bin/sh
#description ...
THRESHOLD=90
ALERT="monitoringbox@abc.com"
TEMPFILE=/tmp/temp1
HOSTNAME=`hostname`
rm -f $TEMPFILE
CPU_LOAD=`sar -P ALL 10 1 |grep 'Average.*all' |awk -F" " '{print 100.0 -$NF}' |cut -d \. -f1`
if [[ $CPU_LOAD > $THRESHOLD ]];
then
echo "CPU notification on $HOSTNAME is ${CPU_LOAD}% " `date`  >> $TEMPFILE
fi
if [ -e $TEMPFILE ]
then
mail -s "Check CPU usage on $HOSTNAME `date`  " $ALERT < $TEMPFILE
fi

Moderator's Comments:
Mod Comment Please use CODE tags as required by forum rules!
... ... ...
I'd suggest changing this:
Code:
if [[ $CPU_LOAD > $THRESHOLD ]];

to
Code:
if [ $CPU_LOAD -gt $THRESHOLD ];

Also I'd debug what the value of CPU_LOAD is actually at the time of the email being sent.

Another question: you're doing this and checking for the existence of the file later:
Code:
echo "CPU notification on $HOSTNAME is ${CPU_LOAD}% " `date`  >> $TEMPFILE

Sounds like after the FIRST triggered condition, the file will be APPENDED to, and any subsequent run of the script will trigger an email.
Don't you want to remove the file AFTER the condition has been triggered?

Hi vgersh99,
Good catch on the -gt versus > test. Inside [[ ... ]], > compares strings rather than numbers, so "100" sorts before "90". Without that change, the script could miss reporting that the CPU load was 100% (but it still shouldn't have caused any false high load reports). Note that the script we were shown in post #4 didn't have this bug.

Note the rm -f $TEMPFILE line in the quoted code above (marked in red in the original post). I agree wholeheartedly that the temp file is not needed (and suggested removing it back in post #5 in this thread), but the temp file is removed before it is appended to in the code you're questioning, so that shouldn't have caused any false high load reports either (assuming the code shown to us in post #9 is the actual code being run).
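Following up on vgersh99's suggestion to check what CPU_LOAD actually contains when an email goes out, one low-cost option is to append each measurement to a log file and compare it with the alert times afterwards. This is only a sketch: the `log_load` helper and the log location are made up for illustration.

```shell
#!/bin/sh
# LOGFILE is an assumed location; pick somewhere that suits your system.
LOGFILE=${LOGFILE:-/tmp/cpu_load_debug.log}

# log_load (hypothetical helper): record a timestamped measurement.
#   $1 = measured CPU load, $2 = threshold in effect
log_load() {
    printf '%s load=%s threshold=%s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$1" "$2" >> "$LOGFILE"
}

# Example use inside the monitoring script, just before the comparison:
#   log_load "$CPU_LOAD" "$THRESHOLD"
```

If the log shows an empty or non-numeric value at the time of a "false" alert, the problem is in how the value is collected rather than in the threshold test.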

Quote:
Originally Posted by anil529
you don't need a temp file. How about this for the trailing portion of your script:
Quote:
Code:
CPU_LOAD=`sar -P ALL 10 1 |grep 'Average.*all' |awk -F" " '{print 100.0 -$NF}' |cut -d \. -f1`
if [ $CPU_LOAD -gt $THRESHOLD ]; then
    echo "CPU notification on $HOSTNAME is ${CPU_LOAD}% " `date`  | mail -s "Check CPU usage on $HOSTNAME `date`  " $ALERT
fi

I can change and test.

Hi anil529,
Why don't you also make the changes I suggested in post #5 in this thread (where I also proposed getting rid of the temp file) and get rid of two unneeded processes that are adding unneeded load to the system you're trying to monitor?
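Post #5 is not reproduced here, so the following is only a guess at the kind of change meant: in the pipeline quoted above, grep and cut can presumably be folded into the awk call, leaving sar plus a single awk process. The `cpu_busy_pct` function name is invented for this sketch, and it assumes, as the original pipeline does, that the last field of the "Average: all" row is %idle.

```shell
#!/bin/sh
# cpu_busy_pct (hypothetical helper): read sar-style output on stdin and print
# the busy percentage as a whole number for the "Average: all" row.
# awk does the row selection (formerly grep) and the truncation to an
# integer (formerly cut), so no other processes are needed.
cpu_busy_pct() {
    awk '$1 == "Average:" && $2 == "all" { printf "%d\n", 100 - $NF }'
}

# Example use in the monitoring script:
#   CPU_LOAD=$(sar -P ALL 10 1 | cpu_busy_pct)
#   if [ "${CPU_LOAD:-0}" -gt "$THRESHOLD" ]; then ... fi
```

The `${CPU_LOAD:-0}` in the usage comment is an extra precaution: if sar produces no matching row, the test still gets a number instead of failing on an empty value.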

If you decide to try drysdalk's suggestion instead, at least note that you must change the:
Code:
if [[ "$load" > "$threshold" ]]

to:
Code:
if [[ "$load" -gt "$threshold" ]]

as noted above by vgersh99 to keep from missing reports if the load reaches 100%.
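One caveat worth checking before making that change: in bash, -gt only accepts integers, while drysdalk's script sets threshold="90.00" and its awk command prints a load that can contain a decimal point. A possible way around this, sketched below, is to let awk do the numeric comparison. The `exceeds` helper is hypothetical and not taken from any post in this thread.

```shell
#!/bin/sh
# Behaviour of the two tests discussed above, in bash:
#   [[ "100" > "90" ]]          is false: string comparison, "1" sorts before "9"
#   [[ "95.50" -gt "90.00" ]]   is an error: -gt accepts integers only

# exceeds (hypothetical helper): succeed if $1 is numerically greater than $2.
# Adding 0 forces awk to treat both values as numbers, decimals included;
# an empty value counts as 0.
exceeds() {
    awk -v l="$1" -v t="$2" 'BEGIN { exit !((l + 0) > (t + 0)) }'
}

# Example use in place of the [[ ... ]] test:
#   if exceeds "$load" "$threshold"; then
#       ... send the mail ...
#   fi
```

The alternative is to truncate the load to a whole number first, as the earlier pipeline did with cut, and use an integer threshold such as 90 with -gt.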
# 21  
Old 02-27-2017
If we add monitoring scripts to the server, will that increase the server load?
Say, with 5 scripts running all the time?