Help in understanding how backup and restore works in any organization?


 
# 1  
Old 01-22-2015
Please take your time to answer or comment; there is no urgency. It would help upcoming sysadmins like me understand how things work in the real world.

OS: AIX
Middleware: Weblogic/WAS
Database: Oracle DB/IBM DB2
Backup s/w tools: not available as of now (except native OS commands/utilities)

I'm a newbie (junior SA). I know a little of the theory (commands and so on) from Redbooks etc., but I would like to learn more from professionals with real-world experience.


For instance:

As AIX sysadmins, we are supposed to back up the AIX LPAR.
Meaning:
OS: rootvg (OS filesystems)
Other software: middleware/DB software (uservg)
App: application data (uservg)

* I understand that DB2/Oracle have their own native backup tools, but I am talking in terms of OS sysadmin responsibilities.

I feel that, as an OS admin, I need to make sure the system (all components) stays consistent at all times.

Here are my questions

1) How do we back up a whole AIX LPAR (it may be an app server, a database server or a file server)?

I understand that the mksysb (rootvg) and savevg (uservg) commands are available. Please provide your comments/procedures based on your experience (for example, how organizations implement this).

2) How do we restore both the OS and the user VGs?

Scenarios:
a) How do we restore only 1 or 2 system config files (for example a crontab file) from a backup?
b) How can we restore only the OS (AIX) as of a particular time?
c) How can we restore a few application data files?
d) How do we restore App/DB data from the last month?
e) How can we restore the whole LPAR in case of a system crash (to get back to the state of 3 days ago)?


I also believe that some Oracle DB and WAS installations update OS filesystems; for example, they keep files in the /home directory of their user (/home/orcldir).

3) How can we make sure that both OS data and APP data are in sync after restore?

*Please answer all of the above questions for both of these cases:
A) I have backup software such as TSM or NetBackup in my environment.
B) I do not have any backup software such as TSM/NetBackup.

Please provide your comments/procedure (high-level steps) based on your experience.

Thanks in advance.
# 2  
Old 01-22-2015
There are many constraints on backups and therefore many different strategies. Every backup solution is a trade-off between such constraints: time, effort, money, feasibility, risk assessment, ... Here are some pointers; the list is not meant to be complete:

1) What do you need the backup for?

There are many different reasons to take backups. You may want to restore the base system (after a hardware crash or after a misconfiguration); for this you use the "mksysb" command (and usually a NIM server as destination). You may want to restore application data. These come in two basic classes: transaction data and fixed data. When you install the DB software, all the binaries, configuration files, etc. are fixed, and you need a new backup only when something changes. The transaction data (the DB content, archive logs, etc.) you need to back up more or less continuously.


2) How long may a restore take?

Suppose your hardware is blanked out completely and you have to install/restore from scratch. How much time do you have until the system has to be back up, no matter what? Depending on the answer you need different strategies. If the answer is "1 week", a USB-attached tape drive might suffice and you might not bother to back up the OS completely: you could always install from scratch and maintain copies of some vital config files. If the answer is "3 hours", you will not be able to install and configure the system anew; you need automated restore mechanisms as well as fast backup connections. Maybe your answer is somewhere in between, and then the reasonable equipment is somewhere between these two extremes too.


3) Develop several scenarios and test your environment against them

You might think a backup is about restoring files, but this is not always the case. Here are some scenarios off the top of my head; there are a lot more:

A - restore the OS from scratch or duplicate it to a new machine

B - restore a vital application file of a given size

C - put back the database into the state it had 2 days ago at 11:25 am.

D - restore the database to its latest state (suppose the storage failed).

For each of these scenarios ask yourself: will your solution/environment be able to do it? How long will it take under optimal conditions? Under adverse conditions? How much effort will be involved? Does the user have to be able to do it himself, or should he rely on the sysadmin (team)?

I hope this helps.

bakunin
# 3  
Old 01-22-2015
Thanks for the reply, bakunin. Your response shed light on some of the things I always wanted to know. Now I am seeing the bigger picture.
Thanks for your time.

Regarding my questions:

I understand that DBAs usually help us bring the database/application back to its current state (as far as possible, using archive logs, transaction logs, etc.).
They have backups running all the time (incremental as well as full backups, at DB level).


As an AIX sysadmin,
I can capture the OS image (rootvg) before making any changes such as an upgrade or migration. [I am not talking about small changes, where we can take a file backup using the cp command.]
Other than that, I can schedule a script which takes a mksysb of the AIX LPAR every week.

In my scenario we do not have any backup tools/products.

How can we back up user data (not OS) on a regular basis using OS commands? Incremental or full backups?


I am planning to test the following on a test server:

a) back up the OS using mksysb on a weekly (Sunday) basis to a file/directory - I am aware of this
b) back up all other VGs (user data) on a regular basis?
c) database level: the DBAs can help with their DB-level backups (fixed & transaction) - I am aware of this

d) Finally, if I have all these backup strategies in place, how can I bring the system back to its current state in case of a crash?

OS - I can use the mksysb (last Sunday)
USER/APP files - ?

I would like to restore the OS, the user data and all DB-level backups to make the system run as before.

Please let me know if this is the right way to do it.

Thanks all
# 4  
Old 01-22-2015
It may be a lot of work without any tools, but for this question:

How can we backup user data (not OS) on regular basis using OS commands ? incremental or full backups ?


Since you are taking a mksysb, hopefully to a remote server, and depending on your situation: allocate a new LUN, make it a backup drive, and use something like rsync to make the backups for you. I'm not really sure this will do incremental backups though.

Here is some info from rsync:
Code:
     Rsync copies files either to  or  from  a  remote  host,  or
     locally  on  the  current  host (it does not support copying
     files between two remote hosts).
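
On the incremental question: rsync only transfers files that have changed, and its --link-dest option can keep dated snapshots in which unchanged files are hard links into the previous snapshot. A minimal sketch, assuming rsync is installed on the LPAR and the backup LUN is mounted somewhere like /backup; the function name, the "latest" link and all paths are made up for illustration:

```shell
#!/bin/sh
# Sketch: dated rsync snapshots. Files unchanged since the previous snapshot
# become hard links to it, so each run only stores what changed.
# Function name and directory layout are illustrative assumptions.

# snapshot SRC_DIR BACKUP_ROOT NAME   (use absolute paths)
snapshot() {
    src=$1 root=$2 name=$3
    mkdir -p "$root" || return 1
    if [ -d "$root/latest" ]; then
        # --link-dest: hard-link files that are identical in the last snapshot
        rsync -a --delete --link-dest="$root/latest/" "$src/" "$root/$name/" || return 1
    else
        # first run: plain full copy
        rsync -a "$src/" "$root/$name/" || return 1
    fi
    # repoint "latest" at the snapshot just taken
    rm -f "$root/latest"
    ln -s "$root/$name" "$root/latest"
}

# Example: snapshot /data/app /backup/app "$(date +%Y%m%d)"
```

Every snapshot directory then looks like a full copy, so restoring a single file is a plain cp from the day you want.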

# 5  
Old 01-22-2015
The key to a successful strategy is - as always - planning. Let us go through an example server with some exemplary requirements:

The first thing you want to plan is the layout of the VGs and filesystems so that data are already roughly separated into groups:

1) OS - data needed to (re-)create the system in case of disaster

2) fixed application data (binaries, config files, ...) - similar to 1)

3) user data: changing regularly, but you can afford to lose a single version

4) transaction data: changing constantly, and you need to be able to restore to any point in time. DB files, ...

For 1) you have mksysbs. In any sizable environment with more than 5 LPARs you should have a NIM server to store the mksysbs on. Keep the size of the mksysb image as small as possible by excluding everything you do not need to back up: "/tmp", for instance, is by definition expendable. You may or may not exclude "/home" (see 3). The smaller your image is, the faster you can restore the base system and the faster you can take this system backup. It may make sense to include the data falling under (2) here instead of having them in their own category.
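
As a rough sketch of the exclusion step: when called with -e, mksysb skips files matching the patterns listed in /etc/exclude.rootvg. The function names and the target path below are hypothetical, and the sketch simply overwrites the exclude file, so adapt it to your own NIM or NFS layout before using anything like it.

```shell
#!/bin/sh
# Sketch: keep the mksysb image small by excluding expendable directories.
# Function names and the example target path are illustrative assumptions.

# write_excludes FILE: one grep-style pattern per line, anchored at "./"
write_excludes() {
    {
        echo '^./tmp/'            # /tmp is expendable by definition
        # echo '^./home/'         # add this if /home is backed up separately
    } > "$1"
}

# take_mksysb TARGET: back up rootvg to a file (AIX only, run as root)
take_mksysb() {
    # note: this replaces any existing exclude list
    write_excludes /etc/exclude.rootvg || return 1
    # -e: honour /etc/exclude.rootvg, -i: regenerate /image.data first
    mksysb -e -i "$1"
}

# Example: take_mksysb "/backup/mksysb/$(hostname).mksysb"
```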

Decide on a sensible frequency: one week may make sense, but if your installation is stable and you do not change it you may even have longer cycles. In principle you need a new mksysb every time your config changes, however long that takes. You should take backups as often as necessary and as seldom as possible. Which frequency gives the best trade-off between minimal effort and cost on one side and maximal security on the other is a decision you have to make.

Put procedures in place (scripts to take the backups, to validate them, to restore them, ...) and always measure the time necessary to execute them. You need these estimates to set up disaster recovery strategies responsibly.

You also need to plan for test restores to validate your procedures as well as the backups you took. In fact I have often seen backups being taken only for people to find out, when they were needed, that they could not be restored. Usually this gets you in real deep kimchi. So test, over and over again, whatever you do. Further, you need to practise recovery procedures: you do not want to do it for the first time when the disaster has already struck. In a state of emergency you want to do only routine things, so make sure your restore procedure becomes routine to you.
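
A test restore can be scripted too. The sketch below assumes the backup is a plain tar archive created relative to the source directory and that mktemp is available on your system; verify_backup is a made-up name. It restores into a scratch directory and compares the result with the live data, so it only reports success while the source has not changed since the backup was taken.

```shell
#!/bin/sh
# Sketch: validate a tar backup by restoring it into a scratch directory
# and comparing the result with the source tree.
# verify_backup is an illustrative name, not an existing tool.

# verify_backup ARCHIVE SRC_DIR   (ARCHIVE must be an absolute path)
verify_backup() {
    archive=$1 src=$2
    scratch=$(mktemp -d) || return 1
    if (cd "$scratch" && tar -xf "$archive"); then
        diff -r "$src" "$scratch" >/dev/null   # non-zero if anything differs
        rc=$?
    else
        rc=1                                   # archive could not be unpacked
    fi
    rm -rf "$scratch"
    return $rc
}

# Example: time verify_backup /backup/app/full.tar /data/app
```

Running it under "time", as in the example comment, also gives you a first figure for how long a restore of that data takes.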

For 2) - if you haven't included them in 1) - you apply the same rules and questions as for 1), but you may need a separate step because the restore may only be possible once the base system has been restored. You can use "savevg" or maybe a FS backup with "tar"/"cpio", "rsync" as mentioned by techy1, or even other methods. Ask yourself: will it suffice to restore a complete image, or might you need single files from the backup? Might it be an option to re-install from the original media instead of taking backups?

Only when you have clarified your requirements should you decide upon the method which best fulfills them. Again: planning is vital!

Now for 3): depending on how much data users keep on your server and how many revisions they can afford to lose, you need to plan your backup strategy. Notice that for a backup to work you need to restrict access to the data while it is being taken - always. A typical value is that users can afford to lose one day of work, therefore you back up their data every day. Depending on the volume and the time you have for restores you can do full and/or incremental backups.

The full backup takes the longest to run and produces the biggest amount of data, but its restore is the fastest. Once you have one full backup you can add any number of incremental backup generations until you take a full backup again. An incremental backup consists only of the differences from the previous backup, therefore you need to restore the last full backup and then one incremental backup after the other until you arrive at the point in time you want to get to. The more generations of incremental backups you have, the longer the restore might take, but the less your data changes, the faster the backup runs. Also the volume of an incremental backup is (considerably) smaller than that of a full backup. Notice that the more the data changes, the less sense an incremental backup makes.
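
The restore order described above can be sketched like this, assuming the full and incremental backups are tar archives created relative to the same directory (restore_chain is a made-up name). Each later archive overwrites the files it contains, so stopping early in the chain gives you an earlier point in time.

```shell
#!/bin/sh
# Sketch: restore a full backup, then apply incrementals oldest first.
# restore_chain is an illustrative name; archives must be absolute paths.

# restore_chain TARGET_DIR FULL.tar [INCR1.tar INCR2.tar ...]
restore_chain() {
    target=$1
    shift
    mkdir -p "$target" || return 1
    for archive in "$@"; do
        # later archives overwrite files restored from earlier ones
        (cd "$target" && tar -xf "$archive") || return 1
    done
}

# Example, back to the state of Tuesday:
#   restore_chain /restore/app /backup/full_sun.tar /backup/incr_mon.tar /backup/incr_tue.tar
```

Files that were deleted between two backups come back with this simple approach, because a plain tar incremental does not record deletions.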

A typical solution (which may or may not be good in your case) might be to take a full backup every week and incremental backups every day. Notice that there are two possible ways to do an incremental backup: on file level (backing up every file which has changed since the last backup as a whole) or on sub-file level (backing up only the differences within files). The former you can do with OS commands and scripts (use "find" to get the list of files, then back them up); for the latter you need specialised software.
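
A minimal sketch of the file-level variant with "find" and "tar", under some assumptions: the function names, the stamp file and the archive names are invented for illustration, the backup directory lies outside the source tree, and the option for reading a file list is -T with GNU tar (AIX native tar uses -L instead, hence the variable).

```shell
#!/bin/sh
# Sketch: file-level full and incremental backups using find and tar.
# Function names, stamp file and archive names are illustrative assumptions.

# GNU tar reads a file list with -T; AIX native tar uses -L instead.
TAR_LIST_FLAG=${TAR_LIST_FLAG:--T}

# full_backup SRC DEST: archive everything under SRC (DEST absolute, outside SRC)
full_backup() {
    src=$1 dest=$2
    mkdir -p "$dest" || return 1
    touch "$dest/stamp.new"                 # time marker taken before the copy starts
    (cd "$src" && tar -cf "$dest/full.tar" .) || return 1
    mv "$dest/stamp.new" "$dest/stamp"
}

# incr_backup SRC DEST NAME: archive only files modified since the last backup
incr_backup() {
    src=$1 dest=$2 name=$3
    [ -f "$dest/stamp" ] || { echo "no full backup yet" >&2; return 1; }
    touch "$dest/stamp.new"
    (cd "$src" && find . -type f -newer "$dest/stamp" -print) > "$dest/$name.list" || return 1
    if [ -s "$dest/$name.list" ]; then      # skip the archive if nothing changed
        (cd "$src" && tar -cf "$dest/$name.tar" "$TAR_LIST_FLAG" "$dest/$name.list") || return 1
    fi
    mv "$dest/stamp.new" "$dest/stamp"
}

# Example: full_backup /data/app /backup/app          (Sunday)
#          incr_backup /data/app /backup/app incr_mon  (Monday)
```

The marker is touched before the files are read, so a file that changes while the backup runs is picked up again by the next incremental rather than missed.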

Now for the last part: the (DB) transaction data. You need a DB file backup, and how often you need to take it depends on the amount of transactions the DB experiences. You need to discuss this with the DBAs. For a DB file backup (aka "full backup") you need the DB to be offline. As above, this is the slowest, most voluminous and most disruptive form of backup, but the fastest overall to restore. Most commercial DBs have provisions for that (e.g. Oracle's RMAN) with hooks to popular backup software (Legato Networker, TSM, ...) and you can use these procedures in your scripts. You should still put scripts into place to automate the procedures you build and minimise the personal effort necessary in case of emergency (scripts tend to stay calm when disaster strikes and the stress level is high, unlike humans).

Once you have decided upon DB file backups (or "cold" backups, as they are also called) you need to back up the transaction logs (Oracle calls them "archive logs"). This is an ongoing job, and usually you have a filesystem for these logs with a FS sensor (a script running permanently): once the FS is filled beyond some threshold value, the archive logs are backed up and then deleted from the system to make room for the next archive logs.
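
Such a FS sensor could look roughly like the sketch below. The ".arc" suffix, the function names, the paths and the numbers in the example are all assumptions, and the sketch ignores the case of a log that is still being written, so agree on the real procedure with your DBAs.

```shell
#!/bin/sh
# Sketch of a filesystem sensor for an archive-log filesystem.
# File suffix, function names and example values are illustrative assumptions.

# fs_usage_pct DIR: print the used percentage of the filesystem holding DIR
fs_usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# over_threshold PCT LIMIT: true if PCT is greater than LIMIT
over_threshold() {
    [ "$1" -gt "$2" ]
}

# sweep_logs LOG_DIR BACKUP_DIR: copy each log, verify the copy, then delete
sweep_logs() {
    logdir=$1 bkdir=$2 rc=0
    mkdir -p "$bkdir" || return 1
    for f in "$logdir"/*.arc; do
        [ -f "$f" ] || continue              # glob matched nothing
        copy="$bkdir/$(basename "$f")"
        if cp -p "$f" "$copy" && cmp -s "$f" "$copy"; then
            rm -f "$f"                       # delete only after a verified copy
        else
            rc=1
        fi
    done
    return $rc
}

# watch_fs LOG_DIR BACKUP_DIR LIMIT_PCT INTERVAL_SECONDS: the permanent loop
watch_fs() {
    while :; do
        pct=$(fs_usage_pct "$1")
        if [ -n "$pct" ] && over_threshold "$pct" "$3"; then
            sweep_logs "$1" "$2"
        fi
        sleep "$4"
    done
}

# Example: watch_fs /oracle/arch /backup/arch 70 300 &
```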

Establish restore mechanisms, test them and also note the time it takes to put the DB into a selected state. A restore works basically like this: you restore the last cold backup, then restore the archive logs taken since then and "roll forward" the database (reapply the transactions in these archive logs) until you reach the point in time you want to get to. The more archive logs you have to roll forward, the longer the process takes. It is a management decision how long it may take. The shorter the acceptable time is, the more full backups or the better hardware you need. This can get as complicated as you wish, with parallel backup/restore sessions, data staging techniques like VTLs, flash copies, hot database backups, hot standby DB instances and what not.

I hope to have shed some light on the principles, but to give you any concrete procedure your questions are just too general and your requirements are not described in enough detail. You may want to think things over in light of what I have written and ask more specific questions once you have come to some decisions and know your requirements better.

I hope this helps.

bakunin
# 6  
Old 01-23-2015
Thanks very much for the reply bakunin. Appreciate your time and help.

This is useful in understanding what to consider. I am glad that I was able to get valuable information from professionals/experts like you.

Your post really helped me a lot in understanding the things I was trying to explore.


Thank you.