Daily checks for AIX business critical boxes.


 
# 1  
Old 10-30-2009

Hi all,

I would like to know which sanity checks should be done daily, without fail, on all business-critical AIX boxes.


# 2  
Old 10-30-2009
Disk space, network connectivity, and security.
# 3  
Old 10-31-2009
Sorry, but your question is too general to be answered.

For instance, it might be critical to watch filesystem capacity if a lot of data is getting stored on the system and the data is not coming in a steady predictable stream. On the other hand there are systems where very little data is stored and the filesystem doesn't have to be watched closely at all. For some systems looking at it every month is enough, for others it is vital to monitor it hourly, yet many systems are somewhere between these extremes.

Specify your question a bit and we might be able to help you better.

bakunin
# 4  
Old 10-31-2009
is 'none' a valid answer?

If you have sufficient monitoring in place, there is no good reason to look directly after them on a daily basis at all - because I get a ticket or am called out in case of any issues. I do monthly capacity checks across my boxes and compare them with previous months - but basically this is all ...
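A month-over-month capacity comparison like the one described above could be sketched roughly as follows. This is only an illustration: the snapshot format (used GB per mount point), the function names, and the linear projection are assumptions, not something taken from an actual setup.

```python
def growth_report(previous, current):
    """Compare two monthly snapshots of used space (GB) per filesystem.

    Returns mount point -> change in GB for filesystems present in both
    snapshots; filesystems that appear in only one are skipped.
    """
    return {m: current[m] - previous[m] for m in current if m in previous}


def months_until_full(capacity, previous, current):
    """Naive linear projection (an assumption): months left at last month's growth.

    Returns None when usage did not grow, because no projection is possible.
    """
    growth = current - previous
    if growth <= 0:
        return None
    return (capacity - current) / growth
```

Feeding it the numbers from two consecutive monthly reports gives a quick hint as to which filesystems need attention before the next check.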

Kind regards
zxmaus
# 5  
Old 10-31-2009
Quote:
Originally Posted by bakunin
Sorry, but your question is too general to be answered.

For some systems looking at it every month is enough, for others it is vital to monitor it hourly, yet many systems are somewhere between these extremes.

Specify your question a bit and we might be able to help you better.

bakunin
OK, as you said, some systems have to be monitored hourly, so I want to know which things should be monitored hourly. Is it restricted to filesystems and memory?

I ask because I don't have real-world experience.

---------- Post updated at 08:26 PM ---------- Previous update was at 08:13 PM ----------

Quote:
Originally Posted by zxmaus
is 'none' a valid answer?

If you have sufficient monitoring in place, there is no good reason to look directly after them on a daily basis at all - because I get a ticket or am called out in case of any issues. I do monthly capacity checks across my boxes and compare them with previous months - but basically this is all ...

Kind regards
zxmaus
Can you please explain what is covered by "sufficient monitoring"?

I think there is a difference between a ticket being issued and checks on business-critical boxes.
# 6  
Old 10-31-2009
Hi,
In my company, ticket = callout within one minute; response time for us SAs is 5 min for prod and 15 min for non-prod. We have a lot of business-critical systems (global trading and transaction systems), so we cannot afford any downtime.

We monitor CPU (wait + idle + usage), avm memory + paging space, disk space (defined per filesystem via thresholds), processes (by names and numbers), logfiles (for defined keywords), obviously errpt, SAN (i.e. whether all paths are up), network, NFS shares, whether systems are pingable/reachable and whether throughput is within thresholds, backups - we even monitor whether the monitoring is up ... and basically everything else you could possibly think of ...
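
To illustrate what "disk space defined per filesystem via thresholds" can mean in practice, here is a minimal Python sketch. The mount points and limits in THRESHOLDS are made-up examples, and a real monitoring product does far more than this; treat it as a sketch of the idea only.

```python
import shutil

# Hypothetical per-filesystem limits (percent used); adjust per system.
THRESHOLDS = {"/": 80.0, "/var": 70.0, "/tmp": 90.0}


def percent_used(total, used):
    """Return used space as a percentage of total (0.0 for an empty total)."""
    return 100.0 * used / total if total else 0.0


def over_threshold(usage, thresholds):
    """Return the sorted mount points whose usage exceeds their limit.

    usage maps mount point -> percent used; mount points without a
    configured threshold are ignored.
    """
    return sorted(m for m, pct in usage.items()
                  if m in thresholds and pct > thresholds[m])


def collect_usage(mounts):
    """Measure current usage for each mount point that can be read."""
    usage = {}
    for m in mounts:
        try:
            du = shutil.disk_usage(m)
        except OSError:
            continue  # not mounted or not readable on this box
        usage[m] = percent_used(du.total, du.used)
    return usage


if __name__ == "__main__":
    for mount in over_threshold(collect_usage(THRESHOLDS), THRESHOLDS):
        print(f"ALERT: {mount} is above its threshold")
```

Keeping the limits in one table per system is the point: a filesystem that fills quickly gets a lower limit than one that barely changes.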

Kind regards
zxmaus
# 7  
Old 11-02-2009
Quote:
Originally Posted by deepm
OK, as you said, some systems have to be monitored hourly, so I want to know which things should be monitored hourly. Is it restricted to filesystems and memory?
Ask yourself what it is that keeps a system going (that is: fulfilling its purpose). This is your answer.

Whether something has to be monitored every minute, hour, day, week or month depends on the system and the characteristics of its purpose. There is no general answer because there is no "general system".

If you ask "which is the best car" without specifying for which purpose, the only thing one could answer is: that depends. If you want to transport tons of goods it might be some large truck and not the Ferrari, if you want to win races it might be the other way round, and if you want to go offroad you will quickly find out that both are quite bad compared to a Land Rover.

Coming back to your question: what keeps a system going?

a) environmental issues
- energy
- climate/temperature control
- ....

b) OS level
- availability of processing resources - CPU
- availability of memory
- availability of storage space - filesystem
- OS resources consumption: process table, etc.
- availability of network bandwidth
- ...

c) application specific
- depends on the application, things like queue lengths, transaction times, ...


Be aware that this list is far from complete; it's just the most obvious things. Feel free to add whatever is important for your system to continue working. As a rule: for everything that is important for the system to keep fulfilling its purpose, you need a "sensor" - a logfile, a piece of software, a blinking warning lamp, whatever.

Some of the things might already be covered: you do not have to watch climate control if the system is in a data center where air conditioning is provided and taken care of without you doing anything. You still might want to watch over fans, etc. and get an alarm if the system starts overheating.

As for the things left on the list: how often something has to be checked depends on the system and what it is used for. Because these checks usually take some processing power (in most cases little programs do the work), it is generally good to do the checks as often as necessary and as rarely as possible. If you have a system where data never gets stored (a gateway system, for instance), a check of the filesystems every minute is superfluous; on a database server it might be necessary. The same goes for CPU, network and all the other things on the list.
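
The "as often as necessary, as rarely as possible" idea can be sketched as a table of per-check intervals plus a small function that reports which checks are due. The check names and the intervals in INTERVALS are purely hypothetical; the right values depend on the system.

```python
# Hypothetical check intervals in seconds; the right values depend on the system.
INTERVALS = {"filesystem": 60, "cpu": 300, "backup": 86400}


def due_checks(now, last_run, intervals):
    """Return the sorted names of checks whose interval has elapsed.

    last_run maps check name -> timestamp (seconds) of the last run;
    a check that has never run is always due.
    """
    due = []
    for name, interval in intervals.items():
        last = last_run.get(name)
        if last is None or now - last >= interval:
            due.append(name)
    return sorted(due)
```

On a gateway box the filesystem interval might be a day, on a database server a minute; only the table changes, not the logic.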

So there is no such thing as a "thing that has to be monitored hourly", because, whatever the thing in question is, depending on the specifics of the system hourly checks might be overkill or far too infrequent.

I hope this helps.

bakunin