An increasing number of clients have been asking me about replacing tape with disk in some--or all--of their backup infrastructure. Until recently, tape cost far less per unit of storage than any other medium. Tape is also easily transportable, making off-site storage possible. In addition, newer tape technologies have increased both speed and capacity dramatically. So why do people want to replace it?
Typical complaints about tape focus on four areas:
- Performance. Because it's serial, tape is perceived--rightly or wrongly--to be slower than disk.
- Reliability. Everyone who has dealt with backups has experienced tape failure: tapes wear out, get mishandled or just go bad.
- Handling. Removing cloned tapes from a library, adding scratch tapes and cataloging bar codes--tape handling can be a nightmare, especially given the ever-increasing number of tapes.
- Cost. Tape media costs and tape off-site storage costs are budget line items that continue to soar.
Even if you're experiencing one or more of these problems regularly in your environment, don't bet on disk as the panacea for backup.
Using disk as a backup device isn't really a new concept. In the mainframe world, virtual tape systems utilizing disk drives as a cache and staging area for tape devices have been in widespread use for a number of years. In open systems, disk has been involved with backup in at least three ways. First, business continuance mirroring products such as EMC's TimeFinder have played an increasingly important role in improving backup performance and offloading production servers. Also, some backup products, such as IBM Tivoli Storage Manager, have utilized the concept of intermediate disk storage pools on backup servers as the target repository for nightly backups, thereby reducing contention for tape drives. Finally, virtually all backup products support backup to a disk device instead of--or in addition to--tape. In all of these cases, however, disk has been used as a temporary repository--the backup data ultimately ends up on tape.
Recently, a number of new products have emerged based on low-cost storage devices, most notably ATA disks. These and other devices provide some interesting options for designing a backup architecture. They tend to fall into one of three categories: low-cost disk subsystems, virtual tape devices and content-addressable storage systems (see "Disk-based backup options," this page).
Low-cost disk subsystems are probably the most recognizable type of device. These are typically ATA-based systems that share many characteristics with traditional SCSI and Fibre Channel (FC) disk systems. The major differentiating factors between these and their higher-priced siblings are performance and cost. They are positioned in the market as near-online storage devices. Because their cost per megabyte is approaching that of tape, and they provide some level of RAID protection, it has become feasible to consider their use in backup environments. Additionally, some vendors offer features such as replication and/or snapshots.
Incorporating such a device into a backup environment would most likely require the use of traditional backup software or, as in the case of Nexsan, vendor-supplied software, with the disk system configured as the target device. Off-site copies would be handled by either device-to-device replication or by using the backup application to make tape copies.
A virtual tape system (VTS) is perhaps the simplest to incorporate into an existing environment. Because it emulates a tape library, it should be no more difficult than adding a new tape device. The functionality of the VTS is highly dependent on the backup software being used. You would still require traditional tape devices to create off-site volumes. Also, it must be pointed out that the VTS products currently available for open systems environments haven't yet reached the functionality and maturity of those found in the mainframe world.
One concern with both low-cost disk systems and virtual tape systems is the amount of disk needed to support an environment (see "Disk backup fuels dramatic capacity increase," this page). While this can be partially offset by a reduction in tape media purchases and storage costs, tape still appears to hold a cost advantage over traditional disk-based solutions.
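As a rough illustration of that concern, the sketch below estimates how much backup disk an environment might need. It uses the five-to-10-times ratio of backup data to primary data that this article cites for standard disk backup; the 10TB of primary data and the function name are hypothetical, chosen only for the example.

```python
def backup_disk_needed(primary_tb: float, ratio: float) -> float:
    """Return the estimated backup capacity in TB for a given backup-to-primary ratio."""
    return primary_tb * ratio


# Hypothetical environment: 10TB of primary data.
primary_tb = 10.0

# Ratios for standard disk backup as cited in this article (five to 10 times).
low = backup_disk_needed(primary_tb, 5)
high = backup_disk_needed(primary_tb, 10)

print(f"Standard disk backup: {low:.0f}TB to {high:.0f}TB for {primary_tb:.0f}TB of primary data")
```

The real multiplier depends on retention periods, backup frequency and change rate, so treat the result as a starting point for a sizing discussion rather than a quote.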
A third approach to low-cost disk storage is content-addressable storage technology. These systems typically consist of a large number of low-profile servers, each with two to four ATA disks. They have several particularly interesting features:
- The servers are clustered in a Redundant Array of Independent Nodes (RAIN). Usually many servers fit in a single cabinet. Like RAID arrays, RAIN protects against failure through redundancy.
- Data that's written to these devices is cataloged by content, using a hashing algorithm based on the data itself. Therefore, when a piece of data is received that is the same as one that has been previously cataloged, there's no need to write another copy to disk.
- Systems are self-healing: when one node or disk fails, data is automatically replicated to another healthy node.
- Replication can be within a single frame or to other local or remote frames, providing interesting options for disaster recovery.
- Because multiple copies of the same data don't need to be stored, the ratio of backup data to primary data in these systems is more likely to be about two times or less, rather than five to 10 times with standard disk backup.
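The idea of cataloging data by its content can be sketched in a few lines. This is a minimal in-memory illustration, not any vendor's implementation: the class name, the choice of SHA-256 as the hashing algorithm and the sample data are all assumptions made for the example. Real products also handle chunking, replication across nodes and hash-collision policy, none of which is shown here.

```python
import hashlib


class ContentAddressedStore:
    """Toy store that keys each piece of data by a hash of its content."""

    def __init__(self):
        self.chunks = {}        # digest -> data, each distinct piece stored once
        self.logical_bytes = 0  # total bytes that clients asked to back up

    def put(self, data: bytes) -> str:
        """Store data and return its content address (hex digest)."""
        digest = hashlib.sha256(data).hexdigest()  # SHA-256 is an assumed choice
        self.logical_bytes += len(data)
        if digest not in self.chunks:
            # Only data that has not been seen before is written.
            self.chunks[digest] = data
        return digest

    def get(self, digest: str) -> bytes:
        """Retrieve data by its content address."""
        return self.chunks[digest]

    def physical_bytes(self) -> int:
        """Bytes actually held after duplicates are eliminated."""
        return sum(len(chunk) for chunk in self.chunks.values())


store = ContentAddressedStore()
monday = store.put(b"payroll database export")
tuesday = store.put(b"payroll database export")     # unchanged file, backed up again
changed = store.put(b"payroll database export v2")  # modified file

print(monday == tuesday)  # identical content maps to the same address
print(store.logical_bytes, store.physical_bytes())
```

Because the second backup of the unchanged file resolves to an address that is already cataloged, nothing new is written for it. That is the effect behind the lower backup-to-primary ratio described above.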
Tape or disk?
How does one decide if disk makes sense for a particular environment, and if so, which type of disk storage to use? Here are some guidelines:
- The key advantage to disk is faster restore time.
- Modern tape devices can back up large files, such as databases, as fast as or faster than disk. In environments with many small files, disk should have an advantage.
- Tape is highly transportable. It's usually easy to send tapes anywhere that they are needed for recovery.
- Introducing disk devices demands new backup procedures as well as likely reconfiguration of backup software.
- Buying and storing more tapes is usually easier than adding disk capacity.