Backup and Recovery

A formal backup process is really an insurance policy. You invest technology and time in backing up system resources so that, in the event of a problem, you can reliably rebuild the same system configuration as before.

Building a Backup Strategy

To determine a backup strategy there are three areas that require consideration:

  1. What information should be backed up
  2. How data should be backed up (that is, what backup technology should be used)
  3. When and how frequently backups should occur

Each element requires analysis and trade-offs that depend on the options available and on the particular environment.

System Data versus User Data

System data makes up the operating system and its extensions. This data should always be kept in the system file systems, namely / (root), /usr, /tmp and /var.

User data is data which the users need to complete their specific tasks. This data should be kept in /home or in any file systems that are created specially for users. User programs and text should not be placed in file systems designed to contain system data.

If the user data is kept separate from system data, it is easier to manage backups. In general, a backup of user and system data is kept in case data is accidentally removed or in case of a disk failure. When doing backups for these reasons, you should back up the system data separately from the user data.

There are two reasons for keeping system data separate from user data:

Backing Up Your System

This procedure describes how to make an installable image of the root volume group, either for restoring your system if necessary or for installing another system (cloning). It is based on the mksysb command. The following are different ways of calling the mksysb program:

smit startup
smit mksysb
mkszfile && mksysb /dev/rmtX.Y

Each of them makes a complete backup of all the mounted JFS file systems in the rootvg volume group. If you have more than one volume group (use the lsvg command to check), you must use one of the other backup commands (backup, tar, pax or cpio) to back up the other volume groups, since mksysb backs up only the rootvg volume group.
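
As a rough sketch of the lsvg check mentioned above, the output of lsvg (one volume group name per line) can be filtered to show which volume groups mksysb will not cover. The function name other_volume_groups and the sample volume groups datavg and uservg are illustrative assumptions; the sample input stands in for real lsvg output so that the fragment can be tried on any system.

```shell
# Print every volume group name read from standard input except rootvg.
# These are the volume groups that need backup, tar, pax or cpio,
# because mksysb only covers rootvg.
other_volume_groups() {
    grep -v '^rootvg$'
}

# On an AIX system you would run:  lsvg | other_volume_groups
# Here a hypothetical list of volume groups stands in for the lsvg output.
printf 'rootvg\ndatavg\nuservg\n' | other_volume_groups
```

If the filter prints nothing, rootvg is the only volume group and mksysb covers all of it.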

The mksysb backup makes a bootable tape that can be used to completely rebuild your system or to recover one or more files. It backs up everything in the rootvg volume group, including the ODM database. The ODM contains device configuration data.


**** NOTE: **** The mksysb command retains the IP address of the system being backed up. Therefore, if you are installing another system, you must change the IP address of that system before connecting it to the network.

Passwords are also restored from a mksysb tape. This can create security problems.


Before Using the mksysb Command

Before using the mksysb command you have to decide if you will keep the user file systems of rootvg mounted.

User File Systems of rootvg Unmounted

Note the following when backing up the system with unmounted user file systems:

User File Systems of rootvg Mounted

Note the following when backing up the system with mounted user file systems:

**** NOTE: **** File systems that are mounted across a network using NFS are never backed up by the mksysb command.

If a local directory is mounted over another local directory, this procedure will back up the files twice. Therefore, you should unmount any local directories that are mounted over another local directory.


Other Considerations

Other considerations about the mksysb command follow:

Backup System Procedures

This section discusses some points about the backup and restore procedures of the whole system.

Backup

This procedure is described in the AIX Installation Guide. Here is the explanation of the two commands mkszfile and mksysb:

Restore

This procedure is described in the AIX Installation Guide. However, some notes follow:

Common Configuration and Setup Errors

Here are some problems that are encountered in certain circumstances. Knowing them can help if a mksysb backup or restoration fails.

Backing Up Your User File Systems

Before backing up a file system, you have to choose the type of backup you will use.

Type of Backup

A backup type is described by four aspects. For each aspect you choose one of two alternatives, which gives the eight terms listed below. Each row of the table below shows the two alternatives for one aspect.


Figure: Terminology of Backup Types

Incremental Backup

This procedure backs up all files changed since the last lower-level backup of the file system.

Procedure

A level n backup captures all files changed since the most recent backup at level n-1 or lower.


# backup -0 -u /
# backup -4 -u /

The level 4 backup captures all files in the root (/) file system that have been modified since the level 0 backup.

Use the -u flag together with the -Level parameter when doing an incremental backup. The -u flag ensures that the date, time, and level of each incremental backup are written to the /etc/dumpdates file.
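
The level rule can be illustrated with a small sketch. The function reference_level is a hypothetical helper written for this illustration; it does not read /etc/dumpdates, it only applies the rule to a list of earlier backup levels given in chronological order.

```shell
# Print the level of the earlier backup that a new backup at level $1
# is taken relative to. The remaining arguments are the levels of the
# earlier backups, oldest first. Prints "none" if no lower level exists.
reference_level() {
    new=$1
    shift
    ref=none
    for lvl in "$@"; do
        # Later entries overwrite earlier ones, so the most recent
        # lower-level backup wins.
        if [ "$lvl" -lt "$new" ]; then
            ref=$lvl
        fi
    done
    echo "$ref"
}

reference_level 4 0        # level 4 after a level 0
reference_level 5 0 4      # level 5 after levels 0 and 4
reference_level 4 0 4 5    # a second level 4 after levels 0, 4 and 5
```

In the last call the new level 4 backup is again relative to the level 0 backup, because the earlier level 4 and level 5 backups are not at a lower level.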

Example




Figure: Incremental Backup

Stacked Backup

To make a backup on a stacked tape you have to use the special files associated with a tape.

Special Files for tape drives

Special files associated with each tape device determine which action is taken during open or close operations. These files also dictate, for applicable devices, at what density data is to be written on the tape.

AIX keeps two density settings for each tape drive. You can display and change them by typing smit chgtpe. Here are some possible values for the density settings of tape drives:

The following table shows the different extension numbers you can use with these special files and their corresponding characteristics:


Figure: rmt Special Files


**** NOTE: **** The values of density setting #1 and density setting #2 come from tape drive attributes that can be set using SMIT. Typically density setting #1 is set to the highest possible density for the tape drive and density setting #2 is set to a lower density, but density settings are not required to follow this pattern.

The density value (bytes per inch) is ignored when using a magnetic tape device that does not support multiple densities. For tape drives that do support multiple densities, the density value applies only when writing to the tape. When reading, the drive defaults to the density at which the tape was written.

Most tape drives use a 512-byte block size. The 8mm tape drive uses a minimum block size of 1024 bytes. If SMIT is used to set a smaller block size on this drive, space on the tape is wasted.

For more information about the tape see the System Management Guide for AIX Version 3.2.


Example

A stacked backup can cause undesired results unless you are meticulous and certain of the position of the tape.

If you need to make multiple backups on one tape, you should leave the tape in the tape drive. However, if you need to take the tape out, here are the steps to follow to add backups to the end of a tape.

This example uses the special file /dev/rmt0.1 (high density, no rewind on close, no retension on open) to back up four users' directories.
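
The positioning steps can be sketched as follows. This is a dry run under stated assumptions: each earlier backup occupies exactly one tape file, two backups are already on the tape, and the directories /home/user1 to /home/user4 are hypothetical. The function append_plan is an illustrative helper that only prints the commands, so the plan can be checked before anything is written to the tape.

```shell
TAPE=/dev/rmt0       # rewinds the tape when the device is closed
TAPENR=/dev/rmt0.1   # no rewind on close, no retension on open

# Print the commands that position the tape after $1 existing backups
# and then append one by-name backup for each remaining argument.
# The commands are only printed; run them by hand once the plan looks right.
append_plan() {
    existing=$1
    shift
    echo "tctl -f $TAPE rewind"
    echo "tctl -f $TAPENR fsf $existing"
    for dir in "$@"; do
        echo "find $dir -print | backup -i -f $TAPENR"
    done
}

# Two backups are already on the tape; add four users' directories.
append_plan 2 /home/user1 /home/user2 /home/user3 /home/user4
```

Every command after the rewind uses the no-rewind special file, so each new backup lands behind the previous one instead of overwriting the start of the tape.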

Backup Automation

AIX V3 provides facilities to automate repetitive tasks such as backups. You can use the cron daemon or the at command to start the backup process at specified times.

If a backup requires several steps to work correctly, it is a good idea to put the backup and its associated commands in a shell script. For an example of a daily backup, see Scripts.
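
As an illustration, the entries below would start a daily and a weekly backup script from cron. The install path /usr/local/bin and the schedule (02:00 on Monday to Friday for the daily script, 02:00 on Saturday for the weekly script) are assumptions chosen for this sketch. The function backup_cron_entries only prints the entries; they would be added to root's crontab with crontab -e.

```shell
# Print crontab entries that start the two backup scripts.
# Field order: minute hour day-of-month month weekday command
backup_cron_entries() {
    cat <<'EOF'
0 2 * * 1-5 /usr/local/bin/daily.backup
0 2 * * 6 /usr/local/bin/weekly.backup
EOF
}

backup_cron_entries
```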

Usage of SMIT

Use the smit backup command and follow the menus. Use this procedure for backing up single and small file systems by name, such as /home on your local system. Before using the SMIT method, note the following:

Safety of Backups

If you want to back up file systems that may be in use, you should unmount them first to prevent inconsistencies. If you attempt to back up a mounted file system by i-node, a warning message is displayed. The backup command continues, but inconsistencies in the file system may occur. This warning does not apply to the root file system.

The fuser command can help when the unmount command fails because a file system is busy. It shows which processes are currently using a particular file system, so that those processes can be stopped. The system administrator can then unmount the file system and make a file system backup. The -u flag shows the name of the user who owns each process, and the -k flag immediately kills all processes using the file system.

It is important that you back up data and file systems that are usable, that is to say not corrupted. If you back up corrupted data, you may be in an unrecoverable situation. Here are some techniques that can be used to check for corruption:

Backup Log

You should establish a backup log and a tape labelling system to track all backups and backup media. Whenever a backup is made, add a log entry noting the date, time, and content of the backup. A suggested logging and labelling scheme is as follows:
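
One possible way to keep such a log from a backup script is sketched below. The function log_backup, the log location /tmp/backup.log, the field layout, and the tape label DAILY-MON are all assumptions made for this illustration, not a prescribed format.

```shell
# Append one line per backup to the log file named by BACKUP_LOG.
# Arguments: backup level, file system, tape label.
log_backup() {
    echo "$(date '+%Y-%m-%d %H:%M') level=$1 fs=$2 tape=$3" >> "$BACKUP_LOG"
}

BACKUP_LOG=${BACKUP_LOG:-/tmp/backup.log}   # hypothetical location
log_backup 1 /lh/data DAILY-MON
```

Writing the same label on the tape itself ties each log entry to one physical tape.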

Recovery Test

You can always check the table of contents of a backup with the restore command. See Restore your Backup for more details.

# restore -Tvf /dev/rmt0

After you have set up a functioning backup strategy, it is important to test recoverability. You should ensure that your recovery process and backups do what you want, which is to provide a way of recovering from a disaster. Run a simulated disaster recovery exercise about one to two months after your system was established, and use it to test that your strategy works. If the exercise reveals failings, whether minor or major, adjust your strategy accordingly.

Restore your Backup

You can use the restore command to restore the data backed up with the backup command, or simply to list it.

Other Backup Commands

There are several other commands that create backups and archives. The backup and restore commands are unique to IBM.

IBM is not the only vendor that has unique commands. Only a few backup commands are common to AIX, System V and BSD: dd, cpio, and tar.

Here are the backup commands you can use on an AIX system:

backup
    Backs up files by name or by i-node number.
cpio
    Copies files into and out of archive storage.
dd
    Converts and copies a file.
tar
    Manipulates archive backups.
rdump
    Backs up files by i-node onto a remote machine's device.
pax
    Creates and extracts tar and cpio command archives.

You can get more information about these commands in InfoExplorer.

Backup Strategy Example

Let's work with a system called pippin as an example for a backup strategy. The system pippin runs two major applications: the accounting application LH and a set of office applications. It has only one volume group: rootvg.

As the first part of our backup planning process, we must summarize the key characteristics of the file systems, as in the following table:


Table: File Systems Information for Pippin

Backup Strategy

We can now start to formulate our backup strategy. We have only one highly critical file system, /lh/data. Clearly we should back up these files as frequently as is practical, which generally means a daily backup. We have a large number of medium-criticality file systems. These mostly hold software, but they also include the data in users' files in /home. We should back up this information at a regular interval, and weekly is the next frequency level after daily.

Why select weekly backups? The rationale is to keep the backup regime simple and easy to integrate with normal business processes and cycles. After a daily cycle, weekly is the next longer cycle that most organizations use. The next cycle would be monthly, and we could use that cycle as well. In this example we have chosen weekly because users keep their spreadsheets and word processing data on the system, and a loss of three weeks of changes would be considered unreasonable, as people probably could not remember what they had done.

This raises the point that some of the information in /home may in fact be critical to individual users. In that case we can implement a scheme where users back up their own files during the day if they feel it is critical. The smit backup command allows them to do this.

Finally we have two low-criticality file systems, /oa and /tmp, which we can back up at monthly intervals or not at all.


**** NOTE: **** This backup strategy doesn't deal with whole system backups in detail but of course you should take care of backing up your system with the mksysb command whenever you can, especially before major changes or updates of the system.

Backup Implementation for Pippin

It is important to note here that the choice of an 8mm tape drive makes the design of our backup scheme much simpler. The 8mm tape can hold over 2GB on a single tape.

To protect the critical data in /lh/data we need to back up its contents daily. Note that 20% of this file system changes per day, and that it is 300MB in size. This means that a daily backup will be of the order of 60MB if we back up only changed items, or 300MB if we back up all data. If we had a 150MB capacity 1/4" tape, we would be forced to use incremental backups. This is because we would probably want the daily backup to run unattended, and an incremental backup fits on one tape, whereas a backup of all of /lh/data (300MB, or about two tapes) would have to be attended so that someone can change tapes.
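
The sizing arithmetic in this paragraph can be checked with a few lines of shell. The variable names are chosen for this sketch; the numbers are the ones given in the text.

```shell
# Sizes in MB, taken from the text.
FS_SIZE=300          # size of /lh/data
CHANGE_PCT=20        # share of the data that changes per day
TAPE_CAPACITY=150    # capacity of a 1/4" tape

# Data changed in one day.
daily_incremental=$((FS_SIZE * CHANGE_PCT / 100))

# Tapes needed for all of /lh/data, rounded up because a partly
# filled tape still counts as a whole tape.
tapes_for_full=$(( (FS_SIZE + TAPE_CAPACITY - 1) / TAPE_CAPACITY ))

echo "daily incremental: ${daily_incremental}MB"
echo "tapes for a full backup: ${tapes_for_full}"
```

The same calculation with the capacity of the 8mm tape (over 2GB, or roughly 2000MB) gives a single tape for the full backup, which is why the 8mm drive simplifies the scheme.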

The remainder of the system accounts for about 372MB of information that we will need to back up weekly. The low priority file system /oa is only 20MB. So, even though it is low priority, we should add it to the weekly backup of the other major file systems. This will simplify the backup scheme, and with the 8mm tape we have plenty of space on the tape. Alternatively we could not back it up at all and rely on the original installation media. If you wish to err on the side of safety, add it to the weekly backup.

In summary our backup strategy will be:


Table: Backup Strategy, File Systems and Frequencies

From among these major backup types we need to select the most appropriate commands. For our example we have selected the following:


Table: Backup Strategy, Commands to Use

This strategy minimizes the number of tapes and subtleties involved. There are weekly and daily backups that back up everything they need to.

Scripts

The first script (Figure: Daily Backup Script for Pippin) is run daily to back up the accounting data that has changed since the last level 0 backup. The second script (Figure: Weekly Backup Script for Pippin) backs up the entire system using mksysb and then makes a level 0 backup of the accounting data.

Daily Backup of /lh/data




# daily.backup
#
# This script will back up /lh/data on a nightly basis.
# It can be executed via cron or from the command line
# Requires root authority to function correctly
# Set permissions to be -rwxr--r-- user=root group=system
#
# This script does not include handling of common error conditions
#
# This script does not include logging of progress and logging of
# errors. It is recommended that this be added for a production script.
#
# Set up Tape Unit as a variable, tell people what's about to happen.
#
TAPE=/dev/rmt0
#
wall 'The accounting application will be shutdown for backup in 1 minute'
#
sleep 60
#
# fuser will kill processes using /lh/data
fuser -k -u /lh/data
#
# unmount the directory so it cannot be accessed and check it
umount /lh/data
fsck -p /lh/data
#
# back up all files changed since the last level 0 backup
# (the -f flag is required to name the output device)
backup -1 -u -f $TAPE /lh/data
#
# backup completes .... remount
mount /lh/data
#

Figure: Daily Backup Script for Pippin

Weekly Backup Script for Pippin

To perform a weekly backup on pippin we will use the mksysb command and then stack the level 0 backup of the data at the end of the tape. We do this so that only one tape is needed. Keeping /lh/data separate also helps speed recovery from corruption or loss of /lh/data.


# weekly.backup
#
# This script will back up the system using mksysb.
# It will then back up /lh/data at level 0.
# It can be executed via cron or from the command line.
# Requires root authority to function correctly.
# Set permissions to be -rwxr--r-- user=root group=system
#
# This script does not include handling of common error conditions.
#
# This script does not include logging of progress and logging of
# errors. It is recommended that this be added for a production script.
#
# Set up Tape Unit as a variable.
#
TAPE=/dev/rmt0
TAPENR=/dev/rmt0.1
#
wall 'The system will be shutdown for backup in 1 minute'
#
sleep 60
#
# Kill all processes and bring the system down into maintenance mode
shutdown -m
#
# Unmount /lh/data so that this file system is not included in the mksysb image
umount /lh/data
# Run mksysb
mkszfile && mksysb $TAPE
#
# Position head at end of data just written (4 separate files)
tctl -f $TAPE rewind
tctl -f $TAPENR fsf 4
#
# Check and backup /lh/data
fsck -p /lh/data
backup -0 -u -f $TAPENR /lh/data
mount /lh/data
#
# Reboot the system
shutdown -rF
#

Figure: Weekly Backup Script for Pippin

Recovery

Recovery is the process of applying the most recent backup material to the system in the event of some form of failure. It is recommended that you run a complete system recovery exercise as soon as possible to ensure that the recovery process works. The recovery process should be documented in the system log (see Backup Log). By way of example, the restoration processes for pippin would be as follows:
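
A minimal sketch of how /lh/data might be restored from the tapes written by the two scripts is shown below. It covers only /lh/data, not a full system rebuild from the mksysb image. It assumes that the file system has been recreated and mounted, that the level 0 backup follows the four mksysb files on the weekly tape, and that the most recent daily tape holds a level 1 backup. The function restore_plan is an illustrative helper that only prints the commands.

```shell
TAPE=/dev/rmt0       # rewinds the tape when the device is closed
TAPENR=/dev/rmt0.1   # no rewind on close, no retension on open

# Print the restore steps for one file system: the level 0 backup from
# the weekly tape first, then the level 1 backup from the daily tape.
restore_plan() {
    fs=$1
    echo "cd $fs"
    echo "tctl -f $TAPE rewind"
    echo "tctl -f $TAPENR fsf 4     # skip the four mksysb files on the weekly tape"
    echo "restore -rvf $TAPENR      # level 0 backup"
    echo "restore -rvf $TAPE        # level 1 backup, after loading the daily tape"
}

restore_plan /lh/data
```

The order matters: the level 1 backup only holds files changed since the level 0 backup, so it has to be applied on top of it.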