Operating Systems AIX Role of sys admin during power outage in Data center Post 302447373 by zxmaus on Monday 23rd of August 2010 01:38:59 AM
Old 08-23-2010
Well, this largely depends on the length of your outage and your datacentre setup.

In our company we have redundant power suppliers and, where possible, redundant power supplies. We also have our own generators, which can keep our estate alive for 48 hours. We run annual tests that simulate worst-case scenarios to make sure we can continue working as if nothing had happened. Still, every year some systems go down because they have no redundant power supply or because one of the supplies failed unnoticed.

You as a system administrator have to make sure that you do not lose data and that you can keep your operations running. You might have to fail certain systems over to a DR environment. Or, like Mike said, you at least have to shut down your boxes gracefully. If you have systems that have been up continuously for months or even years, it might even be a good idea to perform an alt_disk_install beforehand, so that you have a separate bootable clone of your box in case it doesn't come up cleanly: large differences in the temperature of internal disks can corrupt filesystems and destroy disks.

Once the power is back, check your system logs for failures and error messages; sometimes a loss of power causes hardware failures.
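As a rough illustration of that log check: on AIX the error log is normally read with errpt, and the sketch below assumes its usual summary layout (IDENTIFIER, TIMESTAMP, T, C, RESOURCE_NAME, DESCRIPTION). The function names and the sample lines are made up for the example, and the identifiers are placeholders, not real error IDs. It only filters a captured summary for permanent hardware errors; it does not replace reading the detailed report.

```python
from typing import Dict, List


def parse_errpt_summary(text: str) -> List[Dict[str, str]]:
    """Split errpt-style summary lines into fields (layout is an assumption)."""
    entries = []
    for line in text.splitlines():
        fields = line.strip().split(None, 5)
        # Skip blank/short lines and the column header.
        if len(fields) < 6 or fields[0] == "IDENTIFIER":
            continue
        identifier, timestamp, err_type, err_class, resource, description = fields
        entries.append({
            "identifier": identifier,
            "timestamp": timestamp,
            "type": err_type,      # P = permanent, T = temporary, I = informational
            "class": err_class,    # H = hardware, S = software, O = operator
            "resource": resource,
            "description": description,
        })
    return entries


def hardware_failures(entries: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only permanent hardware errors, the ones to look at first."""
    return [e for e in entries if e["class"] == "H" and e["type"] == "P"]


# Hypothetical sample output; identifiers and messages are placeholders.
SAMPLE = """\
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
0000AAAA   0823013810 P H hdisk0         DISK OPERATION ERROR
0000BBBB   0823013510 T H ent0           ETHERNET DOWN
0000CCCC   0823013010 I O errdemon       ERROR LOGGING TURNED ON
"""

if __name__ == "__main__":
    for entry in hardware_failures(parse_errpt_summary(SAMPLE)):
        print(entry["resource"], "-", entry["description"])
```

In practice you would feed it the text captured from the box after the power-up instead of the sample string, and then look at the detailed entries for anything it flags.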

Kind regards
zxmaus
 

FAQ(1)                                mrtg                                FAQ(1)

NAME
       help - How to get help if you have problems with MRTG

SYNOPSIS
       MRTG seems to raise a lot of questions. There are a number of resources apart from the documentation where you can find help for mrtg.

FAQ
       Alex van den Bogaerdt <alex@ergens.op.Het.Net> maintains the MRTG FAQ website on http://faq.mrtg.org

       In the following sections you find some additional Frequently Asked Questions, with Answers.

   Why is there no "@#$%" (my native language) version of MRTG
       Nobody has contributed a @#$%.pmd file yet. Go into the mrtg-2.9.17/translate directory and create your own translation file. When you are happy with it send it to me for inclusion with the next mrtg release.

   I need a script to make mrtg work with my xyz device.
       Probably this has already been done. Check the stuff in the mrtg-2.9.17/contrib directory. There is a file called 00INDEX in that directory which tells what you can find in there.

   How does this SNMP thing work
       There are many resources on the net, explaining about SNMP. Take a look at this article from the Linux Journal by David Guerrero:

        http://www.develnet.es/~david/papers/snmp/

       And at this rather long document from CISCO

        http://www.cisco.com/univercd/cc/td/doc/cisintwk/ito_doc/snmp.htm

   The Images created by MRTG look very strange.
       Remove the *-{week,day,month,year}.png files and start MRTG again. Using MRTG for the first time, you might have to do this twice. This will also help, when you introduce new routers into the cfg file.

   What is my Community Name?
       Ask the person in charge of your Router or try 'public', as this is the default Community Name.

   My graphs show a flat line during an outage. Why ?
       Well, the short answer is that when an SNMP query goes out and a response doesn't come back, MRTG has to assume something to put in the graph, and by default it assumes that the last answer we got back is probably closer to the truth than zero. This assumption is not perfect (as you have noticed), it's a trade-off that happens to fail during a total outage. If this is an unacceptable trade-off, use the unknaszero option.

       You may want to know what you're trading off, so in the spirit of trade-offs, here's the long answer:

       The problem is that MRTG doesn't know *why* the data didn't come back, all it knows is that it didn't come back. It has to do something, and it assumes it's a stray lost packet rather than an outage.

       Why don't we always assume the circuit is down, and use zero, which will (we think) be more nearly right? Well, it turns out that you may be taking advantage of MRTG's "assume last" behaviour without being aware of it.

       MRTG uses SNMP (Simple Network Management Protocol) to collect data, and SNMP uses UDP (User Datagram Protocol) to ship packets around. UDP is connectionless (not guaranteed) - unlike TCP where packets are tracked and acknowledged and, if needed, re-transmitted, UDP just throws packets at the network and hopes they arrive. Sometimes they don't.

       One likely cause of lost SNMP data is congestion, another is busy routers. Other possibilities include transient telecommunications problems, router buffer overflows (which may or may not be congestion-related), "dirty lines" (links with high error rates), and acts of God. These things happen all the time, we just don't notice because many interactive services are TCP-based and the lost packets get retransmitted automatically.

       In the above cases where some SNMP packets are lost but traffic is flowing, assuming zero is the wrong thing to do - you end up with a graph that looks like it's missing teeth whenever the link fills up. MRTG interpolates the lost data to produce a smoother graph which is more accurate in cases of intermittent packet loss. But with V2.8.4 and above, you can use the "unknaszero" option to produce whichever graph is best under the conditions typical of your network.

AUTHOR
       Tobias Oetiker <oetiker@ee.ethz.ch>

3rd Berkeley Distribution            2.9.17                               FAQ(1)
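The trade-off described in the flat-line answer above can be sketched in a few lines. This is a toy model, not MRTG's code: the function names and the sample readings are invented for illustration, and a missing SNMP reply is represented as None. One function carries the last known value forward (the default behaviour the FAQ describes), the other plots missing replies as zero (what the unknaszero option asks for).

```python
from typing import List, Optional, Sequence


def fill_assume_last(samples: Sequence[Optional[float]]) -> List[float]:
    """Replace each missing sample with the last value that did arrive."""
    filled = []
    last = 0.0  # assumption for this sketch: nothing seen yet counts as zero
    for value in samples:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def fill_unknown_as_zero(samples: Sequence[Optional[float]]) -> List[float]:
    """Replace each missing sample with zero."""
    return [0.0 if value is None else value for value in samples]


# Hypothetical readings; None marks a poll that got no reply.
congested = [40.0, 42.0, None, 45.0, None, 44.0]   # link busy, a few replies lost
outage = [40.0, 42.0, None, None, None, None]      # link really down

if __name__ == "__main__":
    print("congested, assume last:", fill_assume_last(congested))
    print("congested, zero:       ", fill_unknown_as_zero(congested))
    print("outage, assume last:   ", fill_assume_last(outage))
    print("outage, zero:          ", fill_unknown_as_zero(outage))
```

With the congested series, carrying the last value forward gives a smooth line while the zero variant produces the "missing teeth"; with the outage series it is the other way round, and carrying the last value forward is what draws the flat line.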