Just when you thought you knew how to design a backup system -- everything changes. A new category of disk array products promises to revolutionize how backups are performed. However, before describing how they work, and what they cost, it's appropriate to explain the problems that they're trying to solve. And it's important to review why backups are performed, and why they are usually sent to tape.
Everyone knows that the only reason we perform backups is to be able to perform recoveries. Recoveries are done for three reasons:
To restore damaged files, file systems or individual systems to their point of failure. Most real-life recoveries aren't done because a data center has been destroyed, or because someone needs a file that was deleted over a year ago. Most recoveries are performed because someone inadvertently deleted an important file, a RAID array was damaged or a database administrator accidentally dropped the wrong table.
To restore a damaged data center to its last available off-site backup. Although recent events have increased the number of people that have actually performed a disaster recovery, most people only worry about having to perform such a recovery. Obviously, the only viable option to recover from a disaster is to store a copy of all backups off-site.
To restore a file, file system or system to an earlier point in time. Sometimes data is corrupted or damaged, and a significant amount of time passes before anyone notices.
For a long time, tape drives and tape libraries have been the only acceptable method to accomplish all three purposes of backup. Optical media's (e.g., MO, CD, DVD) cost per megabyte is usually too high to compete with tape in most environments. Ditto the cost of SCSI disks. Only tape meets all of the criteria mentioned next.
Tape is permanent enough for long-term storage. Because tape is so economical, you can afford to store data that will probably never be needed. And although tape libraries can be expensive, they can be filled again and again with tapes that cost approximately one-tenth the cost of magneto-optical media. Tape can also be easily shipped between locations. The most common method of disaster recovery preparation is to create an extra copy of each backup tape and ship it off-site.
The downside of tape
Given its unique capabilities, tape is still the best method for accomplishing the disaster recovery and archival purposes of backups. However, these types of restores don't make up the bulk of restores. Most restores could be better handled by disk -- if it were only cheap enough.
Using tape for backups is sometimes a challenging proposition. Many of us have grown used to these challenges, even to the point of pretending they don't exist. For example, tapes are now too fast. Even without compression, the 9840B (19/38 MB/s), AIT-3 (15/30 MB/s), LTO (15/30 MB/s) and Super DLT (11/22 MB/s) drives are all too fast to stream using a single Fast Ethernet connection. In order to stream them, many backup applications support multiplexing.
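The mismatch is easy to quantify with a back-of-the-envelope calculation, sketched here in Python. The usable Fast Ethernet throughput of roughly 10 MB/s is an assumption (100 Mbit/s minus protocol overhead), not a measured figure.

```python
# Rough arithmetic behind the streaming problem described above.
# ASSUMPTION: a single Fast Ethernet client delivers ~10 MB/s of
# usable backup throughput (100 Mbit/s minus protocol overhead).
import math

FAST_ETHERNET_MBS = 10.0  # assumed usable throughput, MB/s

def clients_needed(native_mbs):
    """Minimum number of Fast Ethernet clients that must be
    multiplexed together to keep a drive of this speed streaming."""
    return math.ceil(native_mbs / FAST_ETHERNET_MBS)

# Native (uncompressed) drive speeds quoted above, in MB/s
drives = {"9840B": 19, "AIT-3": 15, "LTO": 15, "Super DLT": 11}

for name, speed in drives.items():
    print(f"{name}: {speed} MB/s native -> needs at least "
          f"{clients_needed(speed)} Fast Ethernet streams to stay busy")
```

Every drive in the list needs more than one full Fast Ethernet stream, which is exactly why backup software resorts to multiplexing.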
Multiplexing lets you send multiple backup jobs to a single tape drive simultaneously. These multiple data streams are then multiplexed, or interleaved, onto the tape to supply the tape drive with enough throughput to keep it streaming. However, multiplexing hurts restore performance. Restoring one stream of data that has been interleaved with other streams of data doesn't allow the tape drive to stream, significantly slowing a restore.
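A toy sketch may make the restore penalty concrete. This is purely illustrative and does not reflect any vendor's on-tape format; it simply shows that a restore of one multiplexed job must read every block on the tape while keeping only a fraction of them.

```python
# Toy model of multiplexing: chunks from several backup jobs are
# interleaved round-robin onto one "tape". Illustrative only -- not
# any backup product's actual on-tape layout.
from itertools import zip_longest

def multiplex(streams):
    """Interleave chunks from several backup streams onto one tape."""
    tape = []
    for chunks in zip_longest(*streams.values()):
        for job, chunk in zip(streams.keys(), chunks):
            if chunk is not None:
                tape.append((job, chunk))
    return tape

streams = {
    "jobA": ["a1", "a2", "a3"],
    "jobB": ["b1", "b2", "b3"],
    "jobC": ["c1", "c2", "c3"],
}
tape = multiplex(streams)

# Restoring jobA alone: all nine blocks pass under the head,
# but only three are useful -- the drive cannot stream.
wanted = [chunk for job, chunk in tape if job == "jobA"]
print(tape)    # interleaved: a1, b1, c1, a2, b2, c2, ...
print(wanted)  # only jobA's three chunks survive the filter
```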
Tape-to-tape copying takes too much time. Due to details beyond the scope of this article, copying from tape-to-tape isn't easy - it's difficult to keep two tape drives streaming simultaneously, when one is the source and the other is the destination for a copy. The result is that many people don't make off-site copies.
Sampling of disk backup products
A number of companies manufacture large, multiterabyte ATA/IDE-based arrays addressable via SCSI or Fibre Channel, and at least one software manufacturer makes a product specifically designed to use these arrays for backups. The following is a sample of some of these products (in alphabetical order).
Most people know that the number of tapes required for a restore is directly proportional to the length of time between full backups. The longer you wait to perform a full backup, the more tapes you're going to need to perform a complete restore. Of course, the more tapes you need, the greater the chance that one of them will fail - and ruin the entire restore. This is why many people perform weekly full backups, even though the media costs are considerably higher.
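That reasoning can be made concrete with a little probability. The 1% per-tape failure rate used here is an assumed, illustrative figure, and tapes are assumed to fail independently.

```python
# Illustrative only: how the chance of a ruined restore grows with the
# number of tapes it spans. ASSUMPTION: a 1% per-tape failure rate and
# independent failures -- neither figure comes from the article.
def restore_failure_chance(tapes, per_tape_failure=0.01):
    """P(at least one of `tapes` tapes is bad)."""
    return 1 - (1 - per_tape_failure) ** tapes

# e.g., number of daily incremental tapes since the last full backup
for tapes in (1, 7, 14, 30):
    print(f"{tapes:2d} tapes -> {restore_failure_chance(tapes):.1%} "
          "chance the restore is ruined")
```

The further a full backup recedes into the past, the more tapes a complete restore spans, and the curve above only goes one way.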
Another common problem is the way incremental backups work with many backup software packages. An incremental backup of a large file system may run for over an hour, supplying only a few hundred megabytes of data. This, of course, makes it impossible to stream the tape drive. And even though dynamic drive sharing software - such as Legato's DDS or Veritas' SSO - allows the sharing of a tape drive between multiple servers, these programs don't permit multiple servers to write to the same tape drive simultaneously.
As mentioned previously, a single bad tape can cause a large restore to fail. The more tapes your backup resides on, the greater the chance that a single tape will cause a restore to fail. And, of course, you never know if a tape has failed until you need it - probably one of the biggest disadvantages of tape over disk.
ATA/IDE disk arrays
Disks can solve the limitations of tape mentioned above, but server-class SCSI disk drives cost too much to use as a backup device in most environments.
However, someone realized that SCSI disk drives aren't the only game in town, and ATA/IDE disk arrays were born. Ranging from $8,000 to $10,000 per terabyte, ATA/IDE arrays (see "Sampling of disk backup products") are inexpensive when compared to other disk arrays - approximately one-third to one-fourth of their cost. When comparing these arrays to tape libraries, you must include the price of the tape library and its accompanying media. The robot is often the most expensive part of a tape library, and the more slots the robot can manage, the less it costs per terabyte. Consequently, you will find prices ranging from $10,000 per terabyte for smaller libraries, all the way down to $3,000 per terabyte for larger libraries.
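For reference, here are the per-terabyte figures quoted above, collected side by side. The server-class array range is simply derived from the "one-third to one-fourth" comparison, so treat it as approximate.

```python
# Per-terabyte price ranges quoted in the article. The server-class
# figure is derived from "one-third to one-fourth of their cost",
# so it is an approximation, not a quoted price.
costs_per_tb = {
    "ATA/IDE array":                 (8_000, 10_000),
    "server-class disk array":       (3 * 8_000, 4 * 10_000),  # derived: 3-4x ATA
    "small tape library (w/ media)": (10_000, 10_000),
    "large tape library (w/ media)": (3_000, 3_000),
}

for product, (low, high) in costs_per_tb.items():
    span = f"${low:,}" if low == high else f"${low:,}-${high:,}"
    print(f"{product}: {span} per TB")
```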
On the software side, almost any backup package is capable of backing up to disk; however, some offer better overall solutions if you are going to use disk. For example, Legato NetWorker's disk staging feature is nice, and BakBone NetVault's tape virtualization feature -- which uses disk as if it were tape -- works well, too.
What to do?
OK, here's the punch line: Suppose you need to store about 3TB in your on-site tape library, and you need room for about 1TB of off-site copies. Instead of buying a 4TB tape library, purchase a 3TB disk solution and a 1TB tape library. Make all your on-site backups to disk and leave them there. Just as you would make on-site backups to tape and allow the tapes to expire and be overwritten, you will do the same with the virtual tape library on disk. All on-site recoveries then come straight from disk. There are no tapes to swap, and no robots to be repaired - just a fast virtual tape library.
For off-site and archival purposes, copy each night's backups from the virtual tape library to the real tape library. Those tapes are then ejected and stored off-site for archival restores, to be used in case of a disaster that destroys your virtual tape library. This virtual tape library system has a number of advantages over a traditional library. It doesn't require a constant data stream. Disk drives can go as fast, or as slow, as you need them to. The drives don't need to be multiplexed - since you don't need to stream them, you don't need to multiplex them.
These drives can be shared among servers. Some of these arrays let you create as many virtual tape drives as you could possibly need, allowing each backup client to be given its own virtual tape drive. Disk drives are also quicker to copy from. Unlike tape-to-tape copies, copying from disk-to-tape allows the tape drive to easily stream at its maximum throughput, since it's a local copy coming from a random-access device.
There seems to be some controversy about whether disk-to-disk restore is really as fast as people think. During a restore operation, a disk will be faster than tape simply because the disk does not have to be loaded, fast-forwarded and skipped through. The load/fast forwarding accounts for 30 to 250 seconds per tape in a restore, depending on the drive type. The skipping through has to do with multiplexing. Since people tend to send several backups to one tape drive simultaneously, the restore of a single backup from that tape must read data, skip data, read data, skip data and so on. First, you probably won't need to multiplex to disk, so there will be no need to skip/read/skip. Second, even if you did, the skip/read/skip would be lightning fast because it's disk.
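The load-and-position overhead alone is easy to quantify. A quick sketch, using the 30-to-250-second range above; the tape counts are assumed, illustrative values.

```python
# The overhead arithmetic above, made concrete. The 30-250 seconds per
# tape is the article's range for load/position time; the tape counts
# are illustrative assumptions.
def tape_overhead_minutes(tapes, seconds_per_tape):
    """Pure load/fast-forward overhead of a tape restore, in minutes."""
    return tapes * seconds_per_tape / 60

for tapes in (5, 20):
    lo = tape_overhead_minutes(tapes, 30)
    hi = tape_overhead_minutes(tapes, 250)
    print(f"Restore spanning {tapes} tapes: {lo:.0f}-{hi:.0f} minutes "
          "spent before and between reads -- a disk restore pays none of this")
```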
Disks don't require media management. There are no tapes to load. And, last but not least, you don't have to perform full backups as frequently. A virtual tape library might actually give you greater capacity than a similarly sized tape library. That's because performing full backups less often doesn't increase restore time, or decrease the integrity of your backup.
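The nightly disk-to-tape copy described above could be sketched as a simple staging step. The directory layout, file naming and 14-day retention window here are assumptions for illustration -- in practice the copy would be your backup product's own clone or staging job, not a filesystem copy.

```python
# Minimal sketch of the nightly staging workflow: copy last night's
# backup images from the disk pool (the "virtual tape library") to a
# hand-off directory for the off-site tape job, and expire disk images
# past the on-site retention window. ASSUMPTIONS: *.img file naming,
# mtime as the backup timestamp, and a 14-day retention -- none of
# these reflect a particular backup product.
import shutil
import time
from pathlib import Path

def stage_and_expire(disk_pool, tape_out, retention_days=14, now=None):
    """Return ([staged], [expired]) image names. Copies images newer
    than one day to tape_out; unlinks images older than retention_days."""
    now = time.time() if now is None else now
    staged, expired = [], []
    for image in sorted(Path(disk_pool).glob("*.img")):
        age_days = (now - image.stat().st_mtime) / 86400
        if age_days < 1:                  # last night's backups
            shutil.copy2(image, Path(tape_out) / image.name)
            staged.append(image.name)
        elif age_days > retention_days:   # eligible for expiry/overwrite
            image.unlink()
            expired.append(image.name)
    return staged, expired
```

The expiry branch mirrors what a tape rotation does when old tapes are overwritten; here it is just freeing disk for the next cycle.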
The Gartner Group says that backups are still the most expensive storage application. Maybe that's about to change. Imagine the time and money you could save if you only had to swap tape for off-site backups. The possibilities are limitless.
About the author:
W. Curtis Preston is the president of The Storage Group. He is the author of Unix Backup and Recovery and Using SANs and NAS.