What's better for backup: tape or disk?

What's better for backup: tape or disk? Both, actually, and here's why. The answers may surprise you.

This article can also be found in the Premium Editorial Download: Storage magazine: Comparing the top data backup packages:

Recently, an increasing number of clients have been asking me about replacing tape with disk in some--or all--of their backup infrastructure. Until recently, the cost of tape was far less per unit of storage than any other media. It is easily transportable, making off-site storage possible. In addition, newer tape technologies have increased both speed and capacity dramatically. So why do people want to replace it?

Typical complaints about tape focus on four areas:

  • Performance. Because it's serial, tape is perceived--rightly or wrongly--to be slower than disk.
  • Reliability. Everyone who has dealt with backups has experienced tape failure: tapes wear out, get mishandled or just go bad.
  • Handling. Removing cloned tapes from a library, adding scratch tapes and cataloging bar codes--tape handling can be a nightmare, especially given the ever-increasing number of tapes.
  • Cost. Tape media costs and tape off-site storage costs are budget line items that continue to soar.
However, upon closer analysis, one finds that these really aren't tape problems per se. They are backup management problems: people/process/policy problems that are the result of inadequate backup architectures, lax operational procedures and poor data policy management. While some characteristics of tape can exacerbate these problems, tape itself is not the root cause.

So, if you're experiencing one or more of these problems regularly in your environment, don't bet on disk as the panacea for backup.


Disk-based backup options
DEVICE TYPE | CHARACTERISTICS | EXAMPLE
Low-cost disk subsystems | Traditional SAN or NAS-based storage arrays | Network Appliance NearStore, StorageTek BladeStore, Nexsan InfiniSAN D2D
Virtual tape devices | RAID disk arrays that emulate a tape library | Quantum DX30, Neartek VSE2
Content-addressable storage systems | Highly scalable, multiple low-profile units in a Redundant Array of Independent Nodes (RAIN) architecture | EMC Centera, Avamar Axion

Disk choices
Using disk as a backup device isn't really a new concept. In the mainframe world, virtual tape systems utilizing disk drives as a cache and staging area for tape devices have been in widespread use for a number of years. In open systems, disk has been involved with backup in at least three ways. First, business continuance mirroring products such as EMC's TimeFinder have played an increasingly important role in improving backup performance and offloading production servers. Also, some backup products, such as IBM Tivoli Storage Manager, have utilized the concept of intermediate disk storage pools on backup servers as the target repository for nightly backups, thereby reducing contention for tape drives. Finally, virtually all backup products support backup to a disk device instead of--or in addition to--tape. In all of these cases, however, disk has been used as a temporary repository--the backup data ultimately ends up on tape.

Recently, a number of new products have emerged based on low-cost storage devices, most notably ATA disks. These and other devices open up some interesting options for designing a backup architecture. They tend to fall into one of three categories: low-cost disk subsystems, virtual tape devices and content-addressable storage systems (see "Disk-based backup options," this page).

Low-cost disk subsystems are probably the most recognizable type of device. These are typically ATA-based systems that share many characteristics with traditional SCSI and Fibre Channel (FC) disk systems. The major differentiating factors between these and their higher-priced siblings are performance and cost. They are positioned in the market as near-online storage devices. Because their cost per megabyte is approaching tape, and they provide some level of RAID protection, it has become feasible to consider their use in backup environments. Additionally, some vendors offer features such as replication and/or snapshot capabilities to further enhance their capabilities.

Incorporating such a device into a backup environment would most likely require the use of traditional backup software, or as in the case of Nexsan, vendor-supplied software, with the disk system configured as the target device. Off-site copies would be handled by either device-to-device replication or by using the backup application to make tape copies.


Disk backup fuels dramatic capacity increase
(DBB = disk-based backup; the Year 1 and Year 3 columns show required disk storage in GB)
PRIMARY STORAGE (GB) | DBB-TO-PRIMARY STORAGE RATIO | ANNUAL STORAGE GROWTH RATE (%) | YEAR 1 PRIMARY | YEAR 1 DBB | YEAR 3 PRIMARY | YEAR 3 DBB
1000 | 10:1 | 50% | 1500 | 15000 | 3375 | 33750
1000 | 5:1 | 50% | 1500 | 7500 | 3375 | 16875
1000 | 1.2:1 | 50% | 1500 | 1800 | 3375 | 4050

A virtual tape system (VTS) is perhaps the simplest to incorporate into an existing environment. Because it emulates a tape library, it should be no more difficult than adding a new tape device. The functionality of the VTS is highly dependent on the backup software being used. You would still require traditional tape devices to create off-site volumes. Also, it must be pointed out that the VTS products currently available for open systems environments haven't yet reached the functionality and maturity of those found in the mainframe world.

One concern with both low-cost disk systems and VTS systems is the amount of disk needed to support an environment (see "Disk backup fuels dramatic capacity increase," this page). While this can be partially offset by a reduction in tape media purchases and storage costs, it still appears that tape continues to hold a cost advantage over traditional disk-based solutions.
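The arithmetic behind the capacity table can be sketched in a few lines of Python. This is only an illustration: it assumes growth compounds annually and that backup disk scales linearly with primary storage at the stated ratio, and the function name `required_disk_gb` is made up for this example.

```python
def required_disk_gb(primary_gb, dbb_ratio, annual_growth, years):
    """Return (primary, disk-based backup) capacity in GB after `years`.

    Assumes primary storage compounds at `annual_growth` per year and that
    backup disk is simply `dbb_ratio` times the primary capacity.
    """
    primary = primary_gb * (1 + annual_growth) ** years
    return primary, primary * dbb_ratio


# Reproduce the three scenarios from the table: 1,000 GB growing 50% a year.
for ratio in (10, 5, 1.2):
    for year in (1, 3):
        primary, dbb = required_disk_gb(1000, ratio, 0.50, year)
        print(f"ratio {ratio}:1, year {year}: "
              f"primary {primary:.0f} GB, backup disk {dbb:.0f} GB")
```

Swapping in your own starting capacity, growth rate and backup-to-primary ratio gives a rough first estimate of how much disk a disk-based backup design would need to hold.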

A third approach to low-cost disk storage is content-addressable storage technology. These systems typically consist of a large number of low-profile servers, each with two to four ATA disks. They have several particularly interesting features:

  • The servers are clustered in a Redundant Array of Independent Nodes (RAIN). Usually many servers fit in a single cabinet. Like RAID arrays, RAIN protects against failure through redundancy.
  • Data that's written to these devices is cataloged by content, using a hashing algorithm based on the data itself. Therefore, when a piece of data is received that is the same as one that has been previously cataloged, there's no need to write another copy to disk.
  • Systems are self-healing: when one node or disk fails, data is automatically replicated to another healthy node.
  • Replication can be within a single frame or to other local or remote frames, providing interesting options for disaster recovery.
  • Because multiple copies of the same data don't need to be stored, the ratio of backup data to primary data in these systems is more likely to be two to one or less, rather than the five to 10 times typical of standard disk backup.
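To make the content-addressing idea concrete, here is a minimal sketch of a store that catalogs data by a hash of its content and skips writes for content it has already seen. It is a toy, not a description of how any of the products named above work: the `ContentStore` class is hypothetical, SHA-256 is an assumed choice of hash, and real systems add chunking, node distribution and replication.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: each unique item is kept once, keyed by its digest."""

    def __init__(self):
        self.chunks = {}         # digest -> stored bytes
        self.logical_bytes = 0   # total bytes submitted by backup clients

    def put(self, data: bytes) -> str:
        """Catalog `data` by its content hash and return that hash as the address."""
        digest = hashlib.sha256(data).hexdigest()  # assumed hash; products may differ
        self.logical_bytes += len(data)
        # Write only if this exact content has not been cataloged before.
        self.chunks.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        """Retrieve data by the address returned from put()."""
        return self.chunks[digest]

    def stored_bytes(self) -> int:
        """Bytes actually held on disk after duplicates are eliminated."""
        return sum(len(chunk) for chunk in self.chunks.values())


# Two nightly backups in which only one of three items changed.
store = ContentStore()
monday = [b"payroll.db v1", b"readme.txt", b"logo.png"]
tuesday = [b"payroll.db v2", b"readme.txt", b"logo.png"]
for item in monday + tuesday:
    store.put(item)

print("submitted:", store.logical_bytes, "bytes; stored:", store.stored_bytes(), "bytes")
```

Because the second night's unchanged items hash to addresses already in the catalog, only the changed item consumes new space, which is why the backup-to-primary ratio stays low.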

Tape or disk?
How does one decide if disk makes sense for a particular environment, and if so, which type of disk storage to use? Here are some guidelines:

  • The key advantage to disk is faster restore time.
  • Modern tape devices can back up large files, such as databases, as fast as or faster than disk. In environments with many small files, disk should have an advantage.
  • Tape is highly transportable. It's usually easy to send tapes anywhere that they are needed for recovery.
  • Introducing disk devices demands new backup procedures as well as likely reconfiguration of backup software.
  • Buying and storing more tapes is usually easier than adding disk capacity.
Can you completely eliminate tape from your backup environment? Is disk the future of backup? Disk-based solutions are in their infancy. Products will continue to mature, and more new technologies are on the horizon. While tape will continue to advance and play an important role, for many environments, disk will become a major component of backup in the coming years.
This was first published in March 2003