AIX Production Readiness Checklist (PRC) - check list


 
# 1  
Old 07-23-2013

Hello Everyone,

Can anyone please provide a checklist for validating our newly built AIX LPARs? AIX is new in our environment, so I'm looking for a reference document or checklist to verify new LPARs.

I believe most companies have some kind of checklist for this. Please share any general reference checklist. (I know it depends on the environment, but I'm looking for a generalized document.)

I am trying to put things together so that it will be useful for new builds.

Thanks,
# 2  
Old 07-24-2013
I would consider always:-
  • Hardware or Software disk protection (RAID or SAN disk -or- LVM mirrors)
  • Clustering (if applicable) will bring its own testing issues
  • Boot scripts to start applications
  • Dependencies and power-on sequences of other servers confirmed
  • Backups
    1. Save volume group definitions to a file in the rootvg
    2. Backup rootvg with mksysb to removable media.
    3. Backup all filesystems to removable media
    4. Cycle media off-site frequently - up to you how long to have the cycle
    5. DR procedures and hardware available (or contracted) and tested
  • Security
    1. Root password changed and secured.
    2. Prevent root login except on console.
    3. Sudo rules in place to allow authorised users ability to do their job
    4. Syslog configured and logs suitably sized to hold enough history
    5. Unnecessary ports closed (tftp etc.) through /etc/inetd.conf
    6. Password rules set appropriately, forcing complexity, expiry, history, etc.
    7. Login prompt not suggesting any sort of Welcome (can be argued to be inviting)
  • ...... and probably loads more things.
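The "unnecessary ports closed" item can be checked mechanically. What follows is a minimal sketch in Python, assuming the usual inetd.conf layout where the first field of each line is the service name and a leading `#` disables the entry. The sample text and the `ALLOWED` set are made up for illustration; on a real LPAR you would read /etc/inetd.conf instead.

```python
def enabled_inetd_services(conf_text):
    """Return the names of services that are active (not commented out)."""
    services = []
    for line in conf_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comments; a leading '#' disables an entry.
        if not stripped or stripped.startswith("#"):
            continue
        # The first whitespace-separated field is the service name.
        services.append(stripped.split()[0])
    return services


# Hypothetical sample content, not taken from a real server.
SAMPLE = """\
## service  socket  protocol  wait/nowait  user  server  args
ftp     stream  tcp6    nowait  root    /usr/sbin/ftpd      ftpd
#tftp   dgram   udp6    SRC     nobody  /usr/sbin/tftpd     tftpd -n
telnet  stream  tcp6    nowait  root    /usr/sbin/telnetd   telnetd -a
"""

# Services this host is meant to offer (an assumption for the example).
ALLOWED = {"ftp"}

unexpected = [s for s in enabled_inetd_services(SAMPLE) if s not in ALLOWED]
print(unexpected)
```

Anything the script reports is a candidate for commenting out. Remember that inetd has to re-read its configuration before a change takes effect.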


I hope that these are a starter.



Robin
Liverpool/Blackburn
UK
# 3  
Old 07-24-2013
To make things easier, you can install one LPAR the way you want it, with all the standard things that every LPAR will have in common installed and configured as Robin described.
You can take a backup of this installation with mksysb and use this "golden" or base image from your NIM server as the installation source for new LPARs.
# 4  
Old 07-24-2013
@ Robin
Thanks for your valuable info. I will make a note of it.

@ Zaxxon
Thanks for your response. It's good to know we are following the same approach.
# 5  
Old 07-24-2013
There is one more point, I think: restart time after disaster.

Consider several levels of disaster:
  • software breaks
  • hardware breaks
  • site disaster (complete datacenter down)

Now, for each of these scenarios work out:
  • the cost of downtime
    This is individual to the system and might range from zero (testing system) to some really big amount (mission-critical production system). Ask the business for their estimate, because ultimately they will use the system, not you.
  • the time it takes to rebuild a working system
    Estimate the times for each of these scenarios. For instance: hardware breaks down. How long will it take to get a new system, how long will it take to reinstall it and restore backups? This is about as long as the system is going to be down.
  • ways to shorten that time and an estimation how much that would cost
    Once you have the estimate from the business of how much is at risk, it is easy to assess whether a certain way to improve recovery time is worth it or not.

    Additionally you make your life easier, because systems usually tend to be completely uncritical until they break down. Then they are suddenly very, very important and the company is losing huge amounts of money, all because of you! With the business' estimate in hand you can pass the ball back to them: you said it isn't important and we should not invest in ways to improve recovery time, now you tell me the system is mission-critical? Are you lying now or were you lying then?

Among the ways to improve recovery time are HA systems, which can reduce it to near zero: there are things like disk mirroring or other redundant hardware, there are redundant systems (HACMP) and there are even cross-site solutions which make it possible to survive site disasters. It is easy to argue the costs for such things once you have a quantified risk to set against them.
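To make that comparison concrete, here is a small Python sketch of the arithmetic. All figures are invented for the example (cost per hour, failure frequency, recovery times, price of the improvement) and have to come from the business and from your own estimates. The model is deliberately simple: average yearly loss is cost per hour times recovery hours, divided by the number of years between incidents.

```python
def expected_annual_loss(cost_per_hour, recovery_hours, years_between_incidents):
    """Average yearly loss from one kind of outage."""
    return cost_per_hour * recovery_hours / years_between_incidents


def assess(cost_per_hour, years_between_incidents,
           recovery_hours_now, recovery_hours_improved,
           yearly_cost_of_improvement):
    """Return (yearly saving, whether the saving exceeds the yearly cost)."""
    now = expected_annual_loss(
        cost_per_hour, recovery_hours_now, years_between_incidents)
    improved = expected_annual_loss(
        cost_per_hour, recovery_hours_improved, years_between_incidents)
    saving = now - improved
    return saving, saving > yearly_cost_of_improvement


# Invented example: hardware failure once in five years, one day to rebuild
# from backup versus one hour of failover with a clustered standby.
saving, worthwhile = assess(
    cost_per_hour=5000,
    years_between_incidents=5,
    recovery_hours_now=24,
    recovery_hours_improved=1,
    yearly_cost_of_improvement=15000,
)
print(saving, worthwhile)
```

Run it once per disaster level (software, hardware, site) with that level's own numbers. The result is only as good as the estimates going in, which is one more reason to get them in writing.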

Another point on my checklists is usually the SLA (service level agreement), which overlaps with what I said above: the business (=customer) has to say how much of what they want to have: does the system have to be online non-stop? Will there be downtime windows for maintenance? If yes, how often, how long? Is there a backup plan? How much has to be backed up and how often? How long does it have to be stored? What will the system do and how fast will it have to do it?

All these things should be negotiated and agreed upon. It is easy to create anything the customer is willing to pay for, but most commonly the customer wants everything and is willing to pay for nothing. Negotiate some compromise and make them agree in documented form (inside a company some e-mail is good enough), but definitely make it documented! Just having Frank from accounting say "I understand" in some meeting is going to be forgotten as soon as the meeting is over (if not sooner).
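One way to put numbers on the "online non-stop?" and "how often, how long?" questions is to turn an availability target into hours of downtime per year and compare that with the planned maintenance windows. The Python sketch below does this. The target and the window figures are assumed example values, and it counts planned maintenance against the availability target, which is itself something to agree on in the SLA.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, leap years ignored


def downtime_budget_hours(availability_percent):
    """Hours per year the system may be down and still meet the target."""
    return HOURS_PER_YEAR * (100 - availability_percent) / 100


# Assumed example values, to be replaced by what was actually agreed.
target = 99.9            # agreed availability in percent
windows_per_year = 4     # planned maintenance windows
hours_per_window = 2     # length of each window

budget = downtime_budget_hours(target)
planned = windows_per_year * hours_per_window
remaining = budget - planned

print(f"allowed downtime per year: {budget:.2f} h")
print(f"planned maintenance:       {planned:.2f} h")
print(f"left for unplanned outage: {remaining:.2f} h")
```

If the remaining figure is smaller than your estimated rebuild time for a single hardware failure, the target cannot be met without some form of redundancy, and that is worth pointing out before anyone signs.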

I hope this helps.

bakunin

Last edited by bakunin; 07-24-2013 at 03:40 PM..