Hardware Storage Components

This chapter provides an overview of the capabilities and functions of the hardware storage devices available to AIX. This context should make the rationale behind some of the storage management policies, and the options and features of storage management software, easier to understand, as well as enabling more informed decisions when selecting hardware devices.

Selecting the Hardware Components

Selecting the components that will make up the storage subsystem ranges from very easy to a complex trade-off. First, the application requirements need to be considered in terms of their storage needs; the mix of storage device types then needs to be decided; and finally, specific products need to be chosen.

Points to Consider

Selecting the correct products for inclusion into a storage management subsystem involves consideration of a number of points:

How to Make the Decision

The decision as to which storage device is best for a given environment can be represented using a Venn diagram as in Figure - Simple Storage Component Selection.

The Simple Case

This example shows the case where the decision is not complicated; examination of the points discussed in Points to Consider has placed the solution completely into one of the shaded areas:


Figure: Simple Storage Component Selection

It is not always this easy, however, and requirements often generate a case for more than one device type.

Things Become More Complicated

In these examples, the decision points in Points to Consider have resulted in requirements for more than one device type, as can be seen in Figure - Requirements Suggest Several Components.

Sometimes the decision can be even more difficult though.

The Worst Case

The worst case is not terrible in the sense of being a disaster, but it does mean that the requirements are complex enough to merit selecting devices of all three types (see Figure - Complex Storage Component Selection). Some of the reasons that may have led to these decisions include:


Figure: Complex Storage Component Selection

The chart in Figure - Summary of Device Attributes summarizes the advantages of each of the device types.


Figure: Summary of Device Attributes

Selecting the correct devices for the storage subsystem does, therefore, involve careful consideration of the types of information that will be stored, as well as the access requirements. Once the correct mix of devices has been selected, specific devices must then be chosen from within each device class. Within each class, the basic functions of the device (disk, optical, or tape) can be implemented in a number of different ways, at different costs, and with different levels of function. The following sections endeavor to assist in this choice by discussing the features, functions, and technologies involved in each class.

Selecting the Physical Hardware Devices

This section will look at the different physical implementations of the various devices available, and explain some of the advantages and disadvantages involved in their selection.

Hardware Attachment Adapters

As mentioned in the previous chapter (see Hardware Management), the adapter forms the primary interface between storage devices and the rest of the computer system. The adapter is responsible for communicating instructions and data between a controlling process and the storage device. There are a number of different adapters available, each utilizing different communications protocols, and each with its own pros and cons. Each storage device also supports different combinations of adapters, so it is well worth understanding the differences in order to make sensible decisions.

There are five main things to consider when looking at adapters:

  1. Cabling requirements

    Every adapter technology places limitations upon the length of cable supported between the adapter and a device, and from device to device (if supported). This can have implications for the amount of disk that can be attached, for example (the physical size of the devices may be greater than the cable length allowed for connection). The size of the cable may also cause problems if routing through ducting is necessary.

  2. Performance/Reliability

    The maximum sustainable and burst data transfer rates govern how fast information can be sent to and retrieved from the devices. This has implications for the number and type of devices that can be attached to a particular adapter. The reliability of the technology will also affect performance (some methods are less error prone than others).

  3. Addressability

    This governs both how many devices can be physically attached to an adapter, as well as the type of device. Some adapter technologies allow attachment of multiple systems; this means that more than one processor can share storage devices using this mechanism.

  4. Device support

    Obviously, the adapter selected must be capable of supporting the devices required for attachment now, but consideration should also be given to future requirements, as well as to the range of devices supported (some standards are more open than others).

  5. Cost

    Both the cost of the adapter and the average cost of devices supporting attachment to the adapter should be considered. Some technologies are more expensive than others.

The following sections look at the various adapter options available.

Small Computer System Interface Adapter

The Small Computer System Interface, or SCSI, is one of the most common mechanisms for attaching both IBM* and non-IBM peripherals. SCSI originated from the selector channel on IBM System/360* computers, which was later scaled down by Shugart Associates to make a universal, intelligent disk drive interface. After around four years of discussion, SCSI became an ANSI standard in 1986, expanded to support other kinds of devices as well.

This standard, now referred to as SCSI-1, allows a maximum of seven devices to be attached, and provides a one byte wide parallel bus. Each attached device has a unique address to allow the operating system to communicate with it. Data can be transmitted either synchronously or asynchronously, depending upon the capabilities of the device used; both asynchronous and synchronous devices can share the same SCSI bus, and in fact all devices must start up in asynchronous mode initially to enable this compatibility. Asynchronous transfer rates are typically around 1 to 2.5MB per second, while synchronous devices can communicate faster, from 4 to 10MB per second. The original standard defined an optional synchronous clock speed of up to 5MHz, giving a maximum data rate of 5MB per second on the one byte wide bus.
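
The synchronous data rate follows directly from the bus width and the clock speed; a minimal worked example using the SCSI-1 figures above:

    # Synchronous data rate = bus width (bytes) x clock speed (MHz)
    bus_width_bytes = 1    # SCSI-1: one byte wide parallel bus
    clock_mhz = 5          # optional synchronous clock of up to 5MHz
    print(bus_width_bytes * clock_mhz)   # 5 (MB per second)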

With the release of the SCSI-1 standard in 1986, work started on a new standard, predictably called SCSI-2. This standard is still in the process of being officially approved as an ANSI standard, though many vendors, including IBM, have implemented most of the features in the draft standard. Among the new features are the following improvements.

Downward compatibility is maintained so that SCSI-1 devices can be attached to SCSI-2 buses. The SCSI-1 protocol will be used when communicating with these devices, and SCSI-2 with devices on the same bus that support the new functions.

There are two alternative electrical configurations possible for the SCSI-1 and SCSI-2 standards.

  1. Single Ended

    The single ended interface comprises a ground and a single signal line for each of the SCSI data and control functions. This is the simplest configuration possible, but it is prone to electrical interference, and therefore has recommended cable lengths of three to six meters.

  2. Differential

    The differential interface comprises positive and negative signal lines for each of the data and control functions. The binary value of the transmitted signal is determined from the difference between the voltages of these two signals. Interference affects both signals equally, and hence does not change the difference between them; this provides far more reliable communication, with correspondingly greater cable lengths of up to 19 meters allowed.

There is a third SCSI standard currently being discussed, not surprisingly known as SCSI-3. This standard will provide for even higher data rates, larger numbers of addresses, and greater cable lengths between devices. This will be possible through the utilization of serial buses and packetized protocols. New media will be supported, such as fiber optics, twisted pair, or even wireless.

SCSI also supports the attachment of multiple processors to the SCSI bus, which allows implementation of device sharing.

High Performance Disk Drive Subsystem Adapter

The High Performance Disk Drive Subsystem adapter provides for attachment of up to four serially attached disk subsystems, each of which may address up to four disk devices, giving a total addressability of 16 devices. The distances between adapter and subsystem can be up to 10 meters using copper twisted pair cables. The serial link uses full duplex packetized communications to the disk subsystems, and can support a maximum total data transfer rate of 80MB per second.

High Performance Parallel Interface Adapter

The High Performance Parallel Interface (HiPPI) adapter provides an ANSI standard parallel interface to other computers and to storage devices. The adapter provides simplex or duplex point to point communication at burst data rates of up to 800Mb per second (in each direction) over copper cable at distances of up to 25 meters. This cable distance can be extended using fiber optic extenders or OEM HiPPI switches. The adapter consists of three cards on the RS/6000*, requiring five slots for power consumption reasons; only one is installable per Micro Channel* bus. These points restrict the possible configurations of the system, and thus limit the environments in which the HiPPI interface can be used.

ESCON Channel Adapter

The ESCON channel adapter supports the transfer of data between an RS/6000 and the ESCON channel at a maximum rate of 17MB per second. The adapter supports connections over fiber optic links using LED or LASER technologies. The link between control units, directors and systems can be up to three kilometers with LED technology, and up to 20 kilometers using LASER.

System/370 Channel Emulator Adapter

The System/370 channel emulator adapter provides parallel channel attachment capability via the block multiplexor channel, and supports data transfer at rates of up to 4.5MB per second. The block multiplexor channel cable can be up to 61 meters in length, and up to four control units can be supported.

Serial Storage Architecture

Serial Storage Architecture, or SSA, is an emerging standard defining a new connection mechanism for peripheral devices. The architecture specifies a serial interface that has the benefits of more compact cables and connectors, higher performance and reliability, and ultimately, a lower subsystem cost. A general purpose transport layer provides for 20MB per second full duplex communications over 10 meter copper cables. Devices are connected together in strings, with up to 128 nodes (devices) allowed per string. Information is transmitted in 128 byte frames that are multiplexed to allow concurrent operations. In addition, the full duplex communications allows simultaneous reading and writing of data.

Other Adapters

There are a number of devices that can be connected via local area networks. In the main, these devices support some form of networking protocol, such as TCP/IP and NFS, that allow the computer system to access the device as though it were local. In these cases, the computer system would be attached to the LAN using a token ring or Ethernet adapter.

Disk Storage

This section will look at disk technology generally, before going on to examine the decision process necessary to select the correct disk subsystems for the environment.

Disk Technology

All disk devices are constructed in basically the same way (see Figure - Anatomy of a Disk Device). A number of disk platters fixed to a central hub are rotated at high speed by a motor. Both surfaces of each platter are coated with a thin film of magnetic material where the data will be stored. Information is written to and read from the magnetic surface via small read and write heads that are located at the end of mechanical arms known as actuators. The actuators move the heads back and forth from the outer edge of the platters to the inner edge. Data is written to the disk surface in concentric tracks, so the movement of the actuator locates the head over the required track, and the rotation of the platter moves the track past the heads, allowing information to be read or written. Each platter surface has read and write heads associated with it, though all heads are usually attached to the same actuator assembly, moving in concert, with data read from one platter surface and one track at a time. The length of time it takes for the actuator to move the head to the required track is known as the seek time, while the time taken for the rotation of the platter to bring the correct part of the track under the heads is known as the rotational latency. Once correctly positioned, data is read or written in a continuous stream, and the rate at which this occurs is called the data transfer rate. The combination of the averages of seek time, rotational latency, and data transfer rate defines the performance of the disk device.


Figure: Anatomy of a Disk Device
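
A simple way to see how these three figures combine is to model the average time to service a small random read; the drive parameters below are purely illustrative:

    # Average time to service a random read = seek + rotational latency + transfer
    avg_seek_ms = 9.0              # illustrative average seek time
    rpm = 5400                     # illustrative spindle speed
    transfer_mb_per_sec = 5.0      # illustrative sustained data transfer rate
    request_kb = 4                 # size of the read

    rotational_latency_ms = 0.5 * 60000 / rpm   # on average, half a rotation
    transfer_ms = request_kb / 1024.0 / transfer_mb_per_sec * 1000
    total_ms = avg_seek_ms + rotational_latency_ms + transfer_ms
    print(round(total_ms, 1))      # 15.3 (ms)

Note that for small requests the mechanical delays dominate, which is why the scheduling techniques described below are worthwhile.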

The technology used for the read and write heads, as well as the composition of the magnetic material used on the platter surface, defines the areal density, or how much information per unit area can be stored on the disk device. The earliest mechanisms used small coils that generated a magnetic field when current was passed through them, thereby changing the magnetic polarity of a small area of the disk; the polarity of the area defines the binary value stored. Passing the coil back over the surface causes it to intersect with the magnetic fields generated by each area. When a magnetic field moves through a coil, it causes a current to flow, the direction depending upon the polarity of the field, thus allowing the information to be read back. Greater areal densities, and hence correspondingly greater capacities per drive, have since been achieved with new head technologies, such as thin film heads, which utilize the coil principle, and more recently magneto-resistive, or MR, heads (used for reading only). MR heads use a different principle for reading the state of the magnetic domains, sensing the variation in electrical resistance of the platter surface rather than using induction. This also gives much improved performance.

Locating the information and passing it to the requesting processor is accomplished by electronics in the drive assembly. The design and capabilities here also affect the overall performance of the device. In the simplest case, requests arrive for information located at a particular disk address, so the actuator is positioned at the correct track and the required number of blocks are read and passed to the requestor. The next request arrives, the actuator is repositioned and the request again fulfilled. This involves much seeking back and forth, as well as waiting for the correct parts of the track to arrive, thus introducing significant delay. In order to minimize these periods of inactivity, some devices utilize a mechanism known as elevator seeking where the incoming requests are sorted so that the actuator can fulfill a sequential series of requests on an inward pass, and then again on the outward pass. This minimizes seek delay. Some devices also utilize a mechanism called read ahead whereby the remaining blocks in a track (after a read has been satisfied) are read and cached locally in the device in anticipation of a sequential request. If this occurs, the information can be supplied directly from cache with no costly seek or rotational latency delays.
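
A minimal sketch of the elevator idea, reordering the queued track requests so that the actuator satisfies them in one sweep in each direction rather than seeking back and forth (purely illustrative):

    # Sort queued track requests into one pass in each direction across the disk
    def elevator_order(requests, head):
        ahead = sorted(t for t in requests if t >= head)                  # first pass
        behind = sorted((t for t in requests if t < head), reverse=True)  # return pass
        return ahead + behind

    print(elevator_order([98, 183, 37, 122, 14, 124, 65, 67], head=53))
    # [65, 67, 98, 122, 124, 183, 37, 14]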

One other hardware technique used is known as banding. This takes advantage of the fact that if data is written at a fixed rate, then there will be larger gaps between bits the further out from the center that writes occur. Banding therefore partitions the disk into radial sections and raises the bit density as the heads move outwards through them. This ensures an even data density, and consequently increases the overall capacity of the device.
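
The capacity gain from banding can be sketched as follows: without banding, every track is limited to the capacity of the innermost track, while a banded disk raises the bit density band by band toward the outer edge (all figures hypothetical):

    # Capacity with and without banding, for a disk with four radial bands
    tracks_per_band = 500
    kb_per_track = [50, 60, 70, 80]   # density raised band by band, moving outward

    fixed_rate = len(kb_per_track) * tracks_per_band * kb_per_track[0]
    banded = sum(tracks_per_band * kb for kb in kb_per_track)
    print(fixed_rate, banded)         # 100000 vs 130000KB: a 30% capacity gain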

Disk devices vary enormously in their capacities and performance, with the highest single drive capacities currently around 4GB.

Selecting the Correct Disk Storage Devices

The preceding section explained the basic technology utilized in disk storage devices; this section will focus on selecting the right solution in terms of the drives and subsystems currently available. There are three main considerations (as outlined in Hardware Management): performance, availability, and capacity. In some situations, fault tolerance, or the ability to continue in the event of component failure, is important. This consideration is really a subset of availability, but is separated out in Table - Application Requirements for Disk Storage for clarity. This table provides a guide, for various application types, as to which of the above attributes are required, and should therefore assist in the selection of the correct storage hardware.


Table: Application Requirements for Disk Storage


Having looked at which elements are important on a per application basis, requirements for each attribute can now be examined in terms of specific device type selection.

  1. Capacity

    If increased internal disk storage is required, then the choice is restricted by the number of internal drives supported within the system itself, and then by the capacities of the drives selected. The following table shows the maximum capacities for each of the currently available systems.


    Table: Maximum Internal Storage Capacities

    Internal disks will generally be attached to a SCSI adapter, and may well share the bus with other slower, asynchronous devices such as tape; this will affect performance. Furthermore, designing a highly available solution using only internal disks can be difficult, as the maximum number of drives allowed is not high. If either of these points is important, or if more capacity is required than can be supported internally, then the considerations become slightly more complex. In terms of maximizing storage capacity, Table - Maximum External Storage Capacities and Table - Maximum External Storage Capacities (continued) show the maximum sizes for all drives and subsystems that can be attached to the RS/6000.


    Table: Maximum External Storage Capacities




    Table: Maximum External Storage Capacities (continued)

    This gives an indication of the capacity that can be expected when selecting particular devices, but should be used in conjunction with the sections on performance and availability before making any final decisions. Furthermore, it must be noted that multiples and mixes of these subsystems and devices can be attached, though there are limits on the total due to technology constraints.

    The following table shows the maximum numbers of external storage devices that can be attached to a Micro Channel bus.


    Table: Maximum External Storage per Micro Channel

  2. Performance

    With regard to the disk devices themselves, the major performance issue is application related; that is to say, whether large numbers of small accesses will be made (random), or smaller numbers of large accesses (sequential). For random access, performance will generally be better using larger numbers of smaller capacity drives, with the opposite applying for sequential access. If the overall capacity requirements are large, however, then larger capacity disk drives should be used, as there will still be sufficient drives to enable performance benefits to be gained from concurrent access. Individual disk drive performance information can be found in Table - Individual Disk Drive Characteristics.


    Table: Individual Disk Drive Characteristics

    Generally speaking, when performance is the major issue, the best approach is to benchmark the application set. If the applications spend most of their time waiting for data from disk, then much benefit will be attained from selecting faster storage subsystems. The quickest access to data can be achieved through concurrency, which means being able to read/write from/to multiple disk drives simultaneously in order to satisfy an application request. This functionality is provided with RAID (Redundant Array of Independent Disks) support, which a number of disk subsystems can utilize. Actual performance characteristics will vary from subsystem to subsystem, but in the main, the following points hold true for each of the RAID modes of operation.


    1. RAID 0

      RAID 0 is also known as data striping. Conventionally, a file is written out to (or read from) a disk in blocks of data. With striping, the information is split into chunks (a fixed amount of data) and the chunks written to (or read from) a series of disks in parallel. There are two main performance advantages to this.

      • Data transfer rates are higher for sequential operations due to the overlapping of multiple I/O streams.
      • Random access throughput is higher because access pattern skew is eliminated due to the distribution of the data. This means that with data distributed evenly across a number of disks, random accesses will most likely find the required information spread across multiple disks and thus benefit from the increased throughput of more than one drive.

      RAID 0 is well suited for program libraries requiring rapid loading of large tables, or more generally, applications requiring fast access to read-only data, or fast writing. RAID 0 is designed only to increase performance; there is no redundancy, so any disk failure will require reloading from backups.
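
      The address arithmetic behind striping is simple; a minimal sketch of the chunk-to-disk mapping (the chunk size itself does not matter for the arithmetic):

        # Map a logical chunk number onto (disk, chunk slot) in a RAID 0 array
        def stripe_location(chunk_number, n_disks):
            return chunk_number % n_disks, chunk_number // n_disks

        # Eight consecutive chunks spread round-robin across four disks:
        print([stripe_location(c, 4) for c in range(8)])
        # [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (1, 1), (2, 1), (3, 1)]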

    2. RAID 1

      RAID 1 is also known as disk mirroring. In this implementation, duplicate copies of each chunk of data are kept on separate disks, or more usually, each disk has a twin that contains an exact replica (or mirror image) of the information. If any disk in the array fails, then the mirrored twin can take over. Read performance can be enhanced because the disk with its actuator closest to the required data is always used, thereby minimizing seek times. The response time for writes can be somewhat slower than for a single disk, depending on the write policy; the writes can be executed either in parallel for speed, or serially for safety (see Logical Volume Manager Policies for a complete explanation of mirroring policies). This technique improves response time for read-mostly applications, and improves availability, at the cost of requiring twice as many disks as the disk space in use.

      RAID 1 is most suited to applications that require high data availability and good read response times, and where cost is a secondary issue.
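
      A sketch of the shortest-seek read policy described above, choosing whichever copy's actuator is nearer the required track (the head positions are hypothetical):

        # Read from whichever mirror's actuator is closest to the required track
        def choose_mirror(track, head_positions):
            return min(range(len(head_positions)),
                       key=lambda d: abs(head_positions[d] - track))

        print(choose_mirror(120, head_positions=[40, 100]))   # 1: the second copy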

    3. RAID 2/3

      RAID 2 and RAID 3 are parallel process arrays, where all drives in the array operate in unison. Similar to data striping, information to be written to disk is split into chunks and each chunk written out to the same physical position on separate disks (in parallel). When a read occurs, simultaneous requests for the data can be sent to each disk, which then retrieve the data from the same place and return it for assembly and presentation to the requesting application. More advanced versions of RAID 2 and 3 synchronize the disk spindles so that the reads and writes can truly occur simultaneously (minimizing rotational latency buildups between disks). This architecture requires parity information to be written for each stripe of data, the difference between RAID 2 and RAID 3 being that RAID 2 can utilize multiple disk drives for parity, whilst RAID 3 uses only one. If a drive should fail, the system can reconstruct the missing data from the parity and remaining drives. Performance is very good for large amounts of data, but poor for small requests as every drive is always involved, and there can be no overlapped or independent operation.

      RAID 2 is rarely used, but RAID 3 is well suited for large data objects such as CAD/CAM or image files, or applications requiring sequential access to large data files.
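
      The reconstruction relies on the exclusive-OR property of the parity; a minimal sketch:

        # Parity is the XOR of the data chunks; XOR of the survivors rebuilds a loss
        chunks = [0b1010, 0b0110, 0b1111]            # data on three drives
        parity = chunks[0] ^ chunks[1] ^ chunks[2]   # stored on the parity drive

        rebuilt = chunks[0] ^ chunks[2] ^ parity     # drive 1 fails
        assert rebuilt == chunks[1]                  # its data is recovered exactly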

    4. RAID 4

      RAID 4 addresses some of the disadvantages of RAID 3 by using larger chunks of data and striping the data across all of the drives except the one reserved for parity. Using disk striping means that I/O requests need only reference the drive that the required data is actually on, so simultaneous as well as independent reads are possible. Write requests, however, require a read/modify/update cycle that creates a bottleneck at the single parity drive. This bottleneck means that RAID 4 is not used as often as RAID 5, which implements the same process, but without the bottleneck.

    5. RAID 5

      RAID 5, as has been mentioned, is very similar to RAID 4. The difference is that the parity information is distributed across the same disks used for the data, thereby eliminating the bottleneck. Parity data is never stored on the same drive as the chunk that it protects. This means that concurrent read and write operations can now be performed, and there are performance increases due to the availability of an extra disk (the disk previously used for parity). There are other enhancements possible to further increase data transfer rates, such as caching simultaneous reads from the disks, then transferring that information whilst reading the next blocks. This can generate data transfer rates at up to the adapter speed. Similar to RAID 3, in the event of disk failure, the information can be rebuilt from the remaining drives.

      RAID 5 is best used in environments requiring high availability and fewer writes than reads.
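
      One common way of rotating the parity can be sketched as follows (a simple rotation scheme for illustration; actual subsystems vary in their layouts):

        # For each stripe, which disk holds parity and which hold data
        def raid5_layout(stripe, n_disks):
            parity_disk = stripe % n_disks   # parity rotates stripe by stripe
            data_disks = [d for d in range(n_disks) if d != parity_disk]
            return parity_disk, data_disks

        for s in range(4):
            print(s, raid5_layout(s, n_disks=4))
        # 0 (0, [1, 2, 3])
        # 1 (1, [0, 2, 3])
        # 2 (2, [0, 1, 3])
        # 3 (3, [0, 1, 2])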

    Disk subsystems that support RAID include:

    To summarize, the key performance issues are listed below:



  3. Availability and Fault Tolerance

    The key assessment with regard to availability is how severe the impact of losing data (however temporarily) would be to the business. If, for example, being without access to vital information for two hours would cause unacceptable loss of business, then the system must be designed in such a way that any failure can be remedied within this period. Standard availability usually means that the system is designed in such a way as to minimize the risk of failure, but not to prevent it altogether; choosing highly reliable devices, for example. High availability generally implies introducing some redundancy into the system design, so that the system can continue (albeit usually at reduced performance) while the failing component is replaced. If all vital components in a subsystem have a backup in case of failure (total redundancy), then the system is fault tolerant; this means duplication of all critical components, including, for example, power supplies and cooling fans, as well as allowing replacement of failing parts during continuing subsystem operation.

    The price of high availability or fault tolerance is usually the increased cost of the redundancy involved.

    High availability solutions include mirroring (RAID 1), as well as RAID 3 and RAID 5 parity. Automatic recovery can be built into some of the RAID supporting subsystems as well. Additionally, some subsystems allow redundancy in power supplies, controllers, and cooling to provide fault tolerant, highly available subsystems.

The various advantages and disadvantages of the disk devices and subsystems available are summarized in Table - Comparison of Disk Device and Subsystem Features, which can be used to compare disk solutions, and select the most appropriate for the required environment.


Table: Comparison of Disk Device and Subsystem Features

Tape Storage

This section will look at tape technology, before examining the decision process for selecting the best tape subsystems for the environment.

Tape Technology

There are two basic technologies incorporated into tape devices, and the specifics of these will be discussed shortly. Both, though, utilize the same essential mechanism for writing and reading data: however it is packaged, and whatever materials are used in its construction, tape consists of a long strip of material ranging in width from 4mm to half an inch. The strip is coated (in much the same way as disk) with a magnetic material, and wound onto spools of some kind. Using a transport mechanism dependent on the technology, the tape is moved past read and write heads that utilize similar technology to those used in disk devices to alter or sense the polarity of magnetic domains on the tape, thereby writing or reading data.

It is at this point that the technologies differ, both in the methods used for tape transport, and in the way in which the data is written onto the tape surface.

  1. Helical Scan


    Figure: Helical Scan Principles

    Helical scan technology has its origins in consumer analog video devices, and though there are a number of different formats, the basic principles are the same in each case. As can be seen from Figure - Helical Scan Principles, the tape surface is wound around a large cylindrical head inclined at an angle of some four to five degrees. The tape moves relative to the head, which is itself spinning at high speed. This results in data tracks written at an angle across the tape width, as well as being slightly overlapped. This makes very efficient use of the tape capacity, and gives a good data rate for continuous writing of data (streaming). This capacity comes at the cost of start/stop performance, as synchronization problems slow down the initial access. Additionally, helical scan is a destructive process in the sense that the tape surface is in contact with the read/write head and hence wears more rapidly. Tape is normally contained within a cartridge and extracted to be wound around the head as shown in Figure - Helical Scan Tape Paths. This winding process also takes time, and must be repeated every time the device is loaded or has been idle, as tape cannot be left in contact with the head for long without again causing excessive wear. Head replacement is also difficult, due to the complexity of the transport mechanism.


    Figure: Helical Scan Tape Paths

  2. Longitudinal Recording

    Longitudinal recording was specifically designed for computer data storage. Again there are a number of variations, though all utilize the same basic ideas. As can be seen from Figure - Longitudinal Recording Principles, the tape is moved past stationary read and write heads causing the data tracks to be recorded linearly along the tape's length. In order to make full use of the tape, the heads normally contain multiple elements allowing several tracks to be written or read concurrently. In addition, when a continuous series of tracks has been written along the length of the tape, the direction of motion can be switched, and the heads stepped perpendicular to the movement of the tape, thereby allowing another series of tracks to be written. This process can be repeated until the entire tape width is used, and is known as serpentine track interleaving.


    Figure: Longitudinal Recording Principles

    Longitudinal recording is a non-destructive process with a consequently longer media life. Performance is good for both streaming and start/stop activity, and the data rate is high. Maintenance is a simpler process, and as can be seen from Figure - Longitudinal Recording Tape Paths, the tape transport path and mechanism are generally simpler.

    Two types of spooling method are common. Cartridges similar to helical scan cartridges can be used, though with longitudinal recording, the tape transport path can remain entirely within the cartridge. This makes load and unload operations much faster, and the entire design much simpler. The other mechanism utilizes a single reel within the cartridge, and requires the free end of the tape to be threaded onto a spool within the tape device itself. This does result in a slightly more complex design, and consequently longer load and unload times.


    Figure: Longitudinal Recording Tape Paths

The simpler design of longitudinal devices generally results in greater reliability, though for a given media size, helical scan will provide greater capacity. Start/stop performance and load/unload times are also better with the longitudinal technology.

Both helical scan and longitudinal recording devices can make use of hardware compression before writing data to the tape. In some cases (with the latest Improved Data Recording Capability, or IDRC), this can result in up to a fourfold increase in capacity, depending upon the characteristics of the data to be compressed. Currently, maximum capacities for both technologies are at around 5GB per cartridge without compression.

The latest longitudinal devices now have load times so rapid that, when coupled with new recording strategies, access times to data anywhere on the tape are beginning to enter the acceptable range for interactive use. The next section will look at specific product types with a view to selecting the best tape devices for the environment.

Selecting the Correct Tape Storage Devices

The preceding section looked at tape devices from the technical point of view. This section will now examine the criteria that should be used to choose the correct devices for an environment, as well as the devices available. As in the case of disk devices, there are three main considerations: performance, availability, and capacity.

  1. Capacity

    The capacity of a tape drive refers to how much information can be stored on the media that it uses. This varies as a function of the tape drive technology and the compression techniques used (see Tape Technology for details). If the required capacity exceeds that of any single tape available, and either time constraints exist or there is a requirement for unattended backup, then a tape library should be used. The various capacities available from the individual tape devices are shown in Table - Tape Drive Specifications. As can be seen, capacities are generally higher for the devices using helical scan technology. Tape libraries are discussed at the end of this section.

    Most tape drives support some form of compression, which can increase the amount of data that can be stored on a tape. The degree of compression depends upon the data to be compressed, so the figures shown in Table - Tape Drive Specifications are the maximum ratios. To arrive at a maximum compressed capacity for a particular tape, the ratio should be multiplied by the uncompressed capacity (for example, a 5GB tape with a compression ratio of 2:1 would give 10GB of data). The tape device automatically uncompresses the data when reading back from tape. There is a small overhead involved (small because the compression is usually performed in hardware). As some types of data do not benefit greatly from compression, and to remove the small overhead, most devices allow compression to be turned off if required. Generally though, it is of benefit to leave compression enabled.
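
    The capacity calculation above as a small sketch (the 2:1 ratio is the example's figure; real ratios depend on the data):

        # Best-case capacity: the quoted ratio is a maximum, not a guarantee
        def compressed_capacity_gb(uncompressed_gb, ratio):
            return uncompressed_gb * ratio

        print(compressed_capacity_gb(5, 2))   # 10 (GB), as in the example above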

  2. Performance

    In the case of tape drives, performance mostly refers to the data rates to and from the device. This is usually limited not by the attachment mechanism, but by the device itself, though there are a number of adapters supported (see Table - Tape Drive Specifications). In the case of tape libraries, the time taken to read the first byte of data is usually also a performance measurement, and includes the time taken to load and unload tapes from/to the library; this is discussed at the end of this section. Data rates and attachment methods are detailed in Table - Tape Drive Specifications. Note that although the data rates are generally comparable, start/stop performance is usually superior with longitudinal technology.

  3. Availability

    Availability in this sense usually means reliability, and can be measured as the Mean Time Between Failures (MTBF) for the device, and the reliability of the media. As was mentioned in Tape Technology, media life is greater for longitudinally recorded tapes, with correspondingly fewer errors; additionally, the simpler transport mechanisms employed normally extend the MTBF significantly.




Table: Tape Drive Specifications

Tape libraries build on the drive technologies described in Tape Technology, and generally utilize automation to load and unload one of the drives shown in Table - Tape Drive Specifications from a library of tapes. The library management software must usually be provided by an application, and needs to be written to understand the interface to the library. It is important, therefore, to confirm that the tape library selected is in fact supported by the applications required. The intended usage is important too. If the library will be used for backing up fileservers or workstation clients overnight, then it is necessary to ensure that the data rate is sufficient to do this. A comparison of current tape library products can be seen in Table - Tape Library Specifications.


Table: Tape Library Specifications

Optical Storage

This section will look at optical storage technology, before going on to examine the decision process required to choose the right optical storage devices for the environment.

Optical Technology

There are three basic optical technologies, each providing for different capabilities. Some devices will cope with more than one kind of technology, though this is not always so. In all three cases, information is stored in tracks on the media surface. Each mechanism then uses different methods to read and write information on these tracks.

  1. Compact Disk Read Only Memory

    With Compact Disk Read Only Memory, or CD-ROM, information is molded into the media during the manufacturing process as a series of pits in the surface. The existence or absence of a pit determines the binary state at each point. The information is read back by shining a laser onto the disk surface and measuring the reflected intensity. The compact disk is spun at high speed inside the device and the laser assembly moved radially in and out to give access to the required blocks of information. CD-ROMs are cheap to make and provide an excellent distribution medium. The only potential issue is that there is no recording capability.

  2. Rewritable

    Rewritable media uses magneto-optic technology to store information. As can be seen from Figure - Rewritable Optical Media Technology, the media surface comprises concentric tracks of magnetic material. With this technique, the read/write head consists of two components, an electromagnet and a laser. The media is first prepared by heating each magnetic domain with a high-powered laser in the presence of a magnetic field. This causes the domains to adopt a common polarity. Writing is accomplished by again heating up a domain with the laser while applying a magnetic field of reverse polarity to flip its state. Reading back information is achieved using the reflected light from a lower-powered laser to detect the original polarity domains (zeros) and the reversed polarity domains (ones), through a polarization effect. Erasing is implemented by again heating up the domains with a high-powered laser and simultaneously using the electromagnet to reset the polarity to its original state.

    While providing the benefits of read, write and erase, this technology still manages to be very stable, giving a shelf life of around 10,000 years, and archival life of around 150 years.


    Figure: Rewritable Optical Media Technology

  3. Write Once Read Many

    Write Once Read Many, or WORM, technology is also implemented in a number of different ways. The purpose in each case is the same: to allow information to be recorded as a permanent copy; that is to say, once written, it cannot be erased.

    WORM media is also stable, giving the benefits of extremely long archival life of over 500 years, with the additional advantage of allowing the initial information recording.

  4. Multifunction

    A fourth technology combines the capabilities of WORM and magneto-optical to give drives that support both functions.

These optical technologies also come in a number of different form factors including 5.25 inch and 12 inch media, both single and double sided. The optical disk is normally housed in a cartridge, and typical capacities per cartridge are currently around 1.3GB for double sided, double density, 5.25 inch media.

Currently, lasers operating in the red light frequency range are being used; in the future, switching to lasers in the blue light frequency range will increase the density fourfold and consequently enlarge capacity. The binary state is currently read as a function of the position of the element on the media (known as Pulse Position Modulation, or PPM). In the future, Pulse Width Modulation, or PWM, will be used. This is also known as edge detection, where a state change is interpreted as one level, and no state change as the other (binary 1 and 0). This allows up to a 50% increase in density (see Figure - Pulse Position Modulation Vs Pulse Width Modulation), with a corresponding increase in capacity. Lastly, banding, as described in the section on disk technology, will be used; the overall capacity increase, including banding, is expected to be around 24 times.


Figure: Pulse Position Modulation Vs Pulse Width Modulation
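
A minimal sketch of the edge-detection idea behind PWM: a change between two successive media states is read as a one, and no change as a zero (an illustrative model only, not any particular drive's encoding):

    # Edge detection: a state change means 1, no change means 0
    def decode_edges(states):
        return [int(a != b) for a, b in zip(states, states[1:])]

    print(decode_edges([0, 1, 1, 0, 0, 0, 1]))   # [1, 0, 1, 0, 0, 1]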

Selecting the Correct Optical Storage Devices

Having looked at the technology used in optical storage devices in the previous section, this section will now discuss the selection of the physical devices. Once again, performance and capacity are important differentiators, and should be considered; in addition, with optical media, the technology used is also important.

  1. Capacity

    The capacity of optical media is mainly dependent upon the technology, but within that, also varies with the number of bytes per sector supported. It is important to ensure that the application that will be using the optical device supports the sector size that gives the capacity required. The capacities (for the different bytes per sector supported) are shown in Table - Optical Device Specifications. If the capacities available are insufficient for the environment, then an optical library should be considered; these are discussed at the end of this section.

  2. Performance

    With optical devices, performance involves (as with disk) a combination of the read and write data rates, as well as access times (themselves a combination of seek times and rotational latencies). The attachment method will not affect performance significantly, as the maximum data rates do not come close to the data rates of the adapters; this means that optical devices can share adapters with other devices without performance implications (though the total data rate should be checked to remain below the maximum adapter data rate). The performance characteristics of the various optical devices available are compared in Table - Optical Device Specifications.
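
    A quick sanity check of the kind suggested above, with hypothetical device rates and adapter limit:

        # Devices can share an adapter if their combined rate stays under its maximum
        adapter_max_mb_s = 10.0
        device_rates_mb_s = [1.2, 0.6, 2.0]   # e.g. optical drives sharing one bus
        total = sum(device_rates_mb_s)
        print(total, total <= adapter_max_mb_s)   # 3.8 True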

  3. Technology

    As was described in Optical Storage, there are several different optical technologies currently in use: CD-ROM, WORM, and Rewritable. If the requirement is for distribution or reading only, at relatively low data rates, then CD-ROM is adequate; higher data rates would require WORM or rewritable media devices. WORM is ideal if there is a requirement for long term storage of infrequently accessed information, or for distribution. Rewritable media is more suitable for interactive use when performance is not critical.




Table: Optical Device Specifications

Although generally slower than disk, optical storage is cheaper. Therefore, if there is a requirement for large amounts of secondary storage, and performance is not critical, then optical storage should be considered. If the storage capacity required exceeds that of the optical drives available, then a library should be considered. Using similar technology to that employed in tape libraries, optical libraries utilize automation to load/unload optical drives from magazines of optical media. The same performance, capacity, and technology considerations apply as in the case of optical drives; similar to tape libraries though, library management software needs to be provided that understands how to control the optical library. A comparison of the optical libraries currently available can be found in Table - Optical Library Specifications.


Table: Optical Library Specifications

In addition to the directly attachable library devices shown in Table - Optical Library Specifications, there are also a number of LAN attached optical libraries that can be utilized via NFS. These libraries utilize the same technologies as discussed in this section, the difference being in the method of access. See File Systems for a discussion of NFS. The models of the 3995 that can be attached in this fashion are:

The capacities are the same as for the equivalent x63 models for direct attachment.

Summary

This chapter has discussed in more detail the hardware components available for use by AIX storage management products.

The first section looked at the considerations involved in choosing the types of components appropriate for a particular environment:

Various decision scenarios were examined, including:

The second section examined the characteristics of the hardware devices themselves with a view to selecting from the various products currently available.

Adapters were looked at from the points of view of:

Disks were looked at from the points of view of:

Tape devices were looked at from the points of view of:

Optical devices were looked at from the points of view of: