Wrong Array...

war stories

# 1  
Old 05-30-2018
Wrong Array...

Back in my early days, just starting at my first job where I was doing 100% UNIX...

We had a mixture of Solaris, AIX, and HPUX systems. Our most critical systems (Logistics) were running on a bunch of large HP 9000 servers. I was tasked with decommissioning one of the servers.

It was one of those arrangements where there were three cabinets in a row. One large HP 9000 on each side, and the middle was used to store the boot disks, tape drives, optical drives, etc.

Well, long story short.. I hit the power button on the wrong storage cabinet. Did a few other things... Casually walked back to my desk. It was a good 20 minutes later.

All of a sudden, people are screaming.. PROD IS DOWN.. PROD IS DOWN. I thought, great.. Another fun day....

A few minutes later, it dawned on me that the drives for the production server were in the same rack that I was decommissioning. So, I walked back down to the data center to verify... Sure enough. I hit the wrong button.

I turned the array back on, walked back to my desk, and could now log in again, and everything was fine. To my surprise.. There was no fallout from this. The system had just completely hung... once the storage was turned back on, it just kept on going where it left off.
These 3 Users Gave Thanks to xenix For This Post:
Corona688 (07-23-2018), dodona (06-02-2018), Don Cragun (1 Week Ago)
# 2  
Old 05-31-2018
These old HPs were really great boxes...
# 3  
Old 05-31-2018
I remember having no alternative but to press the power-off button on a K class. I was called on a Sunday between Xmas and New Year because none of the expected batches had produced anything. When I asked what the output was, or why, the only answer I got was that the machine only responded to ping. And this was a very sensitive machine: if something ever happens to it, heads roll...
I went into the white room, only to realise the box was out of control: you couldn't even connect from the console plugged directly into it.
So I crossed my fingers and switched it off..
Then, just like you, I lived through a long moment of solitude... before finally seeing the box up and running; even old Oracle was up...
As I knew I would be asked what had happened, I went through all the activity logs, and after an hour of searching the batches, crons, etc., I found out that a stupid developer had sent corrupted print jobs. Not seeing them come out of the printer, he resubmitted his print job a few (many...) times without looking at the output (the queue was down...), then decided, as it was Friday, to leave for his Xmas holiday...
The Sunday I was called was 10 days later...
This User Gave Thanks to vbe For This Post:
Don Cragun (1 Week Ago)
# 4  
Old 06-13-2018
Hi Guys,

This also goes back around 25 years, when I was involved in two power switch incidents in the space of two weeks. The first incident was on a trading floor in the Minories in London, on a Data General AViiON system, an 8550 I think. Anyway, while getting ready for a very short outage on the system, I found out that the grey import that this company had sneaked onto the contract had a power interlock on one of the Dzus fasteners on the rear panel of the system.

The other incident was much more humorous, from my perspective at least.

Whilst going past a comms room in a fairly deserted part of the building, my attention was attracted by shouts for help. I went into the comms room to see one of the software developers perched cross-legged on a swivel chair in front of a big (for its time) Data General CLARiiON disk array (60 bay, I think). When I asked what the problem was, I did have to struggle to keep a straight face.

The developer had spun round on his chair and the pointed toe of his shoe had depressed the power button on the array. He had had the common sense not to move, so the button hadn't popped out and dropped the array power, but he'd been there for about an hour and was now suffering cramps.

Fortunately there was enough space to get a finger in and hold the button down whilst we got a clean shutdown of the application and the system. However, the developer had to be helped out of the swivel chair by others, as we just left him there while everything was brought down.

The CLARiiON power button was then modified with a Domestos bottle top, with a hole cut in it, Araldited over the offending button.

Ah the good old days.


This User Gave Thanks to gull04 For This Post:
Don Cragun (1 Week Ago)
# 5  
Old 1 Week Ago
Here is my personal "power-loss-story" and it has nothing to do with a power-button:

Back around 2000 I was working for a data center, which - nothing really unusual - had a large ~1000hp diesel generator to back up its UPS. This diesel was tested once a year by switching it on for 2 minutes. Everything worked fine.

Now, what happens if you run such a system for two minutes a year? Disaster struck and showed the responsible people exactly what happens in such a case: during a heavy thunderstorm the nearby transformer station (or, specifically, its large 6m-diameter capacitor) was hit by lightning. The whole area was suddenly without electrical current and the remains of the capacitor lay across half of the street in pieces not bigger than a few centimeters each.

First, our UPS kicked in - as planned. Then, when the UPS batteries went low, the diesel started. It ran for about 5-6 minutes, then seized a piston: regular running is what keeps such a machine lubed, and when it doesn't run, all the bearings eventually run dry of oil. So far, so bad, and our data center stood without electrical power.

What makes the story interesting, though, is: at the same time a dear friend of mine (shockneck) worked for another data center across the street. So, two days after the incident, we sat in a cafe commiserating about life as an admin. And he told me they had a UPS, backed by a diesel. And because they knew that engines have to run every now and then to remain properly lubed, they switched it on every month and let it run for 10 minutes. Alas, their diesel gave up a few minutes after starting. You know, when letting it run so often you need to refill the tank now and then.....

These 4 Users Gave Thanks to bakunin For This Post:
Corona688 (1 Week Ago), Don Cragun (1 Week Ago), Peasant (1 Week Ago), RudiC (1 Week Ago)
# 6  
Old 1 Week Ago
I have a similar story regarding a UPS, a diesel motor and a not so automatic power switch :)
Needless to say, I got a call at 3 in the morning, jumped on my bike, and rode to the DC to see sweaty engineers with laptops everywhere.
The senior management was also there, breathing down our necks - which did not speed up the process, rather slowed it down.

So I joined the party and started breaking a sweat, only to find that some NFS clustered UFS filesystems needed fsck (big filesystems), but the clusterware timeout kicked in because the fsck took too long.
Then, when the timeout occurred, the package was switched to another node, which started fsck again ... a nightmare loop, so to say.

I was happy that I had prepared and clustered ZFS some days earlier for a migration to new storage, so I offlined and unmanaged the package in question, mounted the UFS read-only and ran a couple of rsyncs.
By the time everything else got up, everything but a couple of TB filesystems, mostly used for archive, was done.
The NFS was accessible immediately so apps could write, but data kept pouring in for the rest of the day, with checksumming in the days that followed :)

It went pretty well (no major consequences in general, except downtime), for which I credit, among other things, the enterprise storage batteries, which held the data in cache and flushed it to disk when powered back on again. That mattered, since this was an extremely mixed environment, hardware and operating system wise.

This User Gave Thanks to Peasant For This Post:
Don Cragun (1 Week Ago)