The Lounge What is on Your Mind? Self made monitoring application Post 302948883 by frustrated1 on Friday 3rd of July 2015 12:18:40 PM
Self made monitoring application

Hi, looking for advice / feedback.

I work in IT on an operations team. A number of years ago all monitoring was manual: vast checklists of Unix checks (disk space, application processes, files, etc.), with the results filled in on spreadsheets.

I took some basic scripting courses in ksh and over time built up my skills.
Eventually I started scripting some of the manual checks and alerting to a global text file that was monitored instead.
A year later I decided to go a step further, as there was no money for commercial IT monitoring tools.

So I set up a freeware database and a web server on an old Unix server running Solaris.
I put all events into tables and designed a relatively simple but effective web page as a front end, so alerts can be seen clearly in one view, alerts auto-clear, etc.

Everything is managed by scripts. They monitor disk space, processes, server response, some web page response times and file flows, and they connect to a lot of remote servers to monitor critical metrics and raise alerts.
I have also started putting in some capacity / performance monitoring, using the db to record the data and Google Charts for graphical representation.
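
A rough sketch of the check, alert and auto-clear idea described above. It is written in Python rather than the ksh actually used, and the SQLite table layout, the function names and the 90% threshold are all illustrative assumptions, not the real setup:

```python
import shutil
import sqlite3


def open_event_store(path=":memory:"):
    """Create the (hypothetical) alerts table if it does not exist yet."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS alerts ("
        " host TEXT, check_name TEXT, message TEXT,"
        " PRIMARY KEY (host, check_name))"
    )
    return db


def record_result(db, host, check_name, ok, message=""):
    """Raise the alert when a check fails, clear it when the check passes."""
    if ok:
        db.execute(
            "DELETE FROM alerts WHERE host = ? AND check_name = ?",
            (host, check_name),
        )
    else:
        db.execute(
            "INSERT OR REPLACE INTO alerts (host, check_name, message)"
            " VALUES (?, ?, ?)",
            (host, check_name, message),
        )
    db.commit()


def percent_used(path):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def check_disk(db, host, path, limit_percent=90.0):
    """Compare disk usage against a threshold and record the outcome."""
    used = percent_used(path)
    record_result(
        db, host, "disk:" + path, used < limit_percent,
        "%s is %.0f%% full" % (path, used),
    )


def open_alerts(db):
    """Rows a web front end could display, one per active alert."""
    return db.execute(
        "SELECT host, check_name, message FROM alerts"
        " ORDER BY host, check_name"
    ).fetchall()
```

Because each (host, check) pair has at most one row, a repeated failure just refreshes the message, and the next passing run removes the row, which is one simple way to get the auto-clear behaviour.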

It would never replace a commercial tool for large companies, but I am wondering if there may be smaller companies, with say 50-100 servers, that something like this might interest.
As it's all pretty much ksh scripts on top of the freeware I mentioned, it's not encoded / packaged, so I'm not sure how to go about things even if there was interest.
There has been great feedback internally, especially as the cost is extremely low: literally a basic server and my time.
We are now investing in commercial tools as the company has expanded.

Interested in your thoughts
 

HOBBIT-ALERTS.CFG(5)						File Formats Manual					      HOBBIT-ALERTS.CFG(5)

NAME
       hobbit-alerts.cfg - Configuration for the hobbitd_alert module

SYNOPSIS
       ~xymon/server/etc/hobbit-alerts.cfg

DESCRIPTION
       The hobbit-alerts.cfg file controls the sending of alerts by Xymon when monitoring detects a failure.

FILE FORMAT
       The configuration file consists of rules, which may have one or more recipients associated. A recipient specification may include additional rules that limit the circumstances when this recipient is eligible for receiving an alert.

       Blank lines and lines starting with a hash mark (#) are treated as comments and ignored. Long lines can be broken up by putting a backslash at the end of the line and continuing the entry on the next line.

RULES
       A rule consists of one or more filters using these keywords:

       PAGE=targetstring
              Rule matching an alert by the name of the page in BB. This is the path of the page as defined in the bb-hosts file. E.g. if you have this setup:

                 page servers All Servers
                 subpage web Webservers
                 10.0.0.1 www1.foo.com
                 subpage db Database servers
                 10.0.0.2 db1.foo.com

              Then the "All Servers" page is found with PAGE=servers, the "Webservers" page is PAGE=servers/web and the "Database servers" page is PAGE=servers/db. Note that you can also use regular expressions to specify the page name, e.g. PAGE=%.*/db would find the "Database servers" page regardless of where this page was placed in the hierarchy.

              The PAGE name of the top-level page is an empty string. To match this, use PAGE=%^$.

       EXPAGE=targetstring
              Rule excluding an alert if the pagename matches.

       HOST=targetstring
              Rule matching an alert by the hostname.

       EXHOST=targetstring
              Rule excluding an alert by matching the hostname.

       SERVICE=targetstring
              Rule matching an alert by the service name.

       EXSERVICE=targetstring
              Rule excluding an alert by matching the service name.

       GROUP=groupname
              Rule matching an alert by the group name. Groupnames are assigned to a status via the GROUP setting in the hobbit-clients.cfg file.

       EXGROUP=groupname
              Rule excluding an alert by the group name. Groupnames are assigned to a status via the GROUP setting in the hobbit-clients.cfg file.

       COLOR=color[,color]
              Rule matching an alert by color. Can be "red", "yellow", or "purple". The forms "!red", "!yellow" and "!purple" can also be used to NOT send an alert if the color is the specified one.

       TIME=timespecification
              Rule matching an alert by the time-of-day. This is specified as the DOWNTIME timespecification in the bb-hosts file.

       DURATION>time, DURATION<time
              Rule matching an alert if the event has lasted longer/shorter than the given duration. E.g. DURATION>1h (lasted longer than 1 hour) or DURATION<30 (only sends alerts the first 30 minutes). The duration is specified as a number, optionally followed by 'm' (minutes, default), 'h' (hours) or 'd' (days).

       RECOVERED
              Rule matches if the alert has recovered from an alert state.

       NOTICE
              Rule matches if the message is a "notify" message. This type of message is sent when a host or test is disabled or enabled.

       The "targetstring" is either a simple pagename, hostname or servicename, OR a '%' followed by a Perl-compatible regular expression. E.g. "HOST=%www(.*)" will match any hostname that begins with "www". The same applies to the "groupname" setting.

RECIPIENTS
       The recipients are listed after the initial rule. The following keywords can be used to define recipients:

       MAIL address[,address]
              Recipient who receives an e-mail alert. This takes one parameter, the e-mail address.

       SCRIPT /path/to/script recipientID
              Recipient that invokes a script. This takes two parameters: the script filename, and the recipient that gets passed to the script.

       IGNORE
              This is used to define a recipient that does NOT trigger any alerts, and also terminates the search for more recipients. It is useful if you have a rule that handles most alerts, but there is just that one particular server where you don't want cpu alerts on Monday morning. Note that the IGNORE recipient always has the STOP flag defined, so when the IGNORE recipient is matched, no more recipients will be considered. So the location of this recipient in your set of recipients is important.

       FORMAT=formatstring
              Format of the text message with the alert. Default is "TEXT" (suitable for e-mail alerts). "PLAIN" is the same as text, but without the URL link to the status webpage. "SMS" is a short message with no subject for SMS alerts. "SCRIPT" is a brief message template for scripts.

       REPEAT=time
              How often an alert gets repeated. As with DURATION, time is a number optionally followed by 'm', 'h' or 'd'.

       UNMATCHED
              The alert is sent to this recipient ONLY if no other recipients received an alert for this event.

       STOP   Stop looking for more recipients after this one matches. This is implicit on IGNORE recipients.

       Rules  You can specify rules for a recipient also. This limits the alerts sent to this particular recipient.

MACROS
       It is possible to use macros in the configuration file. To define a macro:

          $MYMACRO=text extending to end of line

       After the definition of a macro, it can be used throughout the file. Wherever the text $MYMACRO appears, it will be substituted with the text of the macro before any processing of rules and recipients.

       It is possible to nest macros, as long as the macro is defined before it is used.

ALERT SCRIPTS
       Alerts can go out via custom scripts, by using the SCRIPT keyword for a recipient. Such scripts have access to the following environment variables:

       BBALPHAMSG
              The full text of the status log triggering the alert.

       ACKCODE
              The "cookie" that can be used to acknowledge the alert.

       RCPT   The recipientID from the SCRIPT entry.

       BBHOSTNAME
              The name of the host that the alert is about.

       MACHIP The IP-address of the host that has a problem.

       BBSVCNAME
              The name of the service that the alert is about.

       BBSVCNUM
              The numeric code for the service. From the SVCCODES definition.

       BBHOSTSVC
              HOSTNAME.SERVICE that the alert is about.

       BBHOSTSVCCOMMAS
              As BBHOSTSVC, but dots in the hostname replaced with commas.

       BBNUMERIC
              A 22-digit number made by BBSVCNUM, MACHIP and ACKCODE.

       RECOVERED
              Is "1" if the service has recovered.

       EVENTSTART
              Timestamp when the current status (color) began.

       DOWNSECS
              Number of seconds the service has been down.

       DOWNSECSMSG
              When recovered, holds the text "Event duration : N" where N is the DOWNSECS value.

       CFID   Line-number in the hobbit-alerts.cfg file that caused the script to be invoked. Can be useful when troubleshooting alert configuration rules.

SEE ALSO
       hobbitd_alert(8), hobbitd(8), xymon(7), the "Configuring Xymon Alerts" guide in the Online documentation.

Xymon Version 4.2.3: 4 Feb 2009 HOBBIT-ALERTS.CFG(5)
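
As an illustration of the ALERT SCRIPTS section, here is a minimal sketch of what a script invoked through the SCRIPT keyword might do with those environment variables. It is written in Python and is not taken from the Xymon distribution. The function name `build_alert_message` and the message layout are made up for the example, and it assumes the variables arrive exactly as the man page describes them:

```python
import os


def build_alert_message(env):
    """Turn the documented alert variables into a subject and body."""
    host_service = env.get("BBHOSTSVC", "unknown")
    recovered = env.get("RECOVERED") == "1"
    state = "RECOVERED" if recovered else "ALERT"
    subject = "[%s] %s" % (state, host_service)

    lines = [
        "Recipient: " + env.get("RCPT", ""),
        "Host: %s (%s)" % (env.get("BBHOSTNAME", ""), env.get("MACHIP", "")),
        "Service: " + env.get("BBSVCNAME", ""),
    ]
    if recovered:
        # DOWNSECSMSG is only described as set once the service has recovered.
        lines.append(env.get("DOWNSECSMSG", ""))
    else:
        # The ack code lets the recipient acknowledge an ongoing alert.
        lines.append("Ack code: " + env.get("ACKCODE", ""))
    lines.append("")
    lines.append(env.get("BBALPHAMSG", ""))
    return subject, "\n".join(lines)


if __name__ == "__main__":
    # When run as an alert script, read the variables from the environment.
    subject, body = build_alert_message(os.environ)
    print(subject)
    print(body)
```

A real script would hand the subject and body to whatever delivery mechanism is wanted (pager gateway, chat hook, ticket system) instead of printing them. Every variable is read with a default so that a missing one does not abort the alert.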