How disk has changed backup

Inexpensive disk has spawned a variety of disk-based backup alternatives. But with more choices comes greater complexity compared to the days when you simply had to choose a backup application and tape library. Backup guru W. Curtis Preston explains the advantages of using disk for backup, including virtual tape libraries and disk-as-disk backup targets, and discusses the pros and cons of alternative disk-based backup methods.

Backup choices are far more numerous and complex than in the days when you simply had to choose a backup application and tape library. Disk has solved the reliability and performance issues that most storage managers have experienced with traditional backup systems, but disk-to-disk-to-tape (D2D2T) has complicated backup and restore processes. Users must now choose from three backup architectures and four types of disk-based backup targets. What follows are the pros and cons of each approach.

The traditional backup architecture
Before explaining how the different D2D backup options work, it's important to understand the backup systems they'll work with and why you might want to augment those systems with disk. In a traditional backup architecture (see "Traditional backup architecture", next page), software resides on the backup client (the server to be backed up) that allows the backup server to transfer that client's data to tape, disk or virtual tape. The data may be transferred across the network (LAN-based), from the client directly to tape/disk across the SAN (LAN-free) or directly from primary storage to secondary storage across the SAN (server-free). In each case, the data is converted into a different format that's understood by the backup software running the backup. This format could be tar, cpio, dump, NTBackup or a custom format understood only by that particular backup package.

  • If you only buy enough disks to hold a few nights' backups (i.e., disk caching), you'll speed up backups but won't speed up restores.
  • If you want to speed up backups and restores, you should buy enough disk to hold all onsite backups.
The biggest advantage of a traditional backup architecture is that it's well understood and has a solid, mature code base. The biggest disadvantage is how it uses tape. It's difficult for a traditional backup system to use the streaming nature of modern tape drives efficiently. To properly stream tape drives, some backup software products (like EMC Corp.'s Legato NetWorker and Veritas Software Corp.'s [now Symantec's] NetBackup) send multiple backup jobs simultaneously to the same tape drive, a technique called multiplexing or interleaving. This helps backups, but has a negative impact on the restore of a single backup; the backup software has to read the entire tape and disregard data it doesn't need. Other backup products, such as IBM Corp.'s Tivoli Storage Manager (TSM), solve the streaming issue with disk staging where backups are first sent to disk before they're sent to tape.
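The restore penalty of multiplexing can be illustrated with a small model. This is only a sketch under simplifying assumptions: the tape is a Python list, blocks are interleaved round-robin, and the function names (`multiplex`, `restore`) are hypothetical and not taken from any backup product.

```python
from itertools import zip_longest

def multiplex(jobs):
    """Interleave blocks from several backup jobs onto one tape, round-robin."""
    streams = [[(name, block) for block in blocks] for name, blocks in jobs.items()]
    tape = []
    for row in zip_longest(*streams):
        # zip_longest pads shorter jobs with None; skip the padding.
        tape.extend(item for item in row if item is not None)
    return tape

def restore(tape, job_name):
    """Scan the whole tape, keeping only the blocks that belong to one job."""
    restored = []
    blocks_read = 0
    for name, block in tape:
        blocks_read += 1              # every block passes under the head
        if name == job_name:
            restored.append(block)    # blocks from other jobs are discarded
    return restored, blocks_read

jobs = {
    "client_a": ["a1", "a2", "a3"],
    "client_b": ["b1", "b2", "b3"],
    "client_c": ["c1", "c2", "c3"],
}
tape = multiplex(jobs)
data, blocks_read = restore(tape, "client_a")
print(data, blocks_read)  # ['a1', 'a2', 'a3'] 9
```

In this toy model, restoring three blocks for one client means reading all nine blocks on the tape, which is the trade-off described above.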

With the advent of lower-priced ATA-based disk arrays, however, everyone can take advantage of disk staging or disk-based backups without switching from a traditional backup architecture. Simply augment your tape library with a combination of disk and tape.

Disk backup options
There are four ways to add disk to a traditional backup system. The first two options are called disk-as-disk because they involve using disk drives behaving as disk drives--the disks aren't pretending to be tape. In a SAN disk-as-disk configuration (see "SAN disk-as-disk," this page), a disk array is connected to one or more backup servers via a SAN, and a disk volume is assigned to each server. Each server then puts a filesystem on that volume, and backups are sent to that filesystem. In a NAS disk-as-disk architecture (see "NAS disk-as-disk," this page), the disk resides behind a filer head that shares filesystems via NFS or CIFS, and backups are sent to those filesystems.

The last two options employ virtual tape libraries (VTLs), where disk systems are placed behind a server running software that lets the disk array pretend to be one or more tape libraries. "Standalone virtual tape library" (this page) shows standalone VTLs that sit next to a physical tape library and pretend to be another tape library. Once you back up to a standalone VTL, you must use the backup server to copy its backups to physical tape if you want to send them offsite. An integrated VTL (see "Integrated virtual tape library," this page) sits between a physical tape library and a backup server, where it pretends to be a physical library. The backup server backs up to the integrated VTL, which then copies the data to the physical tape portion of its library.

When backup software backs up to a disk-as-disk system, it knows it's a disk and typically creates a file within the filesystem. To distinguish these backups from those sent to a tape (or virtual tape) target, some people refer to these types of backups as filesystem-based backups.

Advantages of disk-as-disk targets
The biggest advantage disk-as-disk targets have over most VTL targets is price. Most disk-as-disk systems are priced significantly lower per gigabyte than VTL systems because, with a VTL, you're also paying for the value of the VTL software.

It's possible to save more money by redeploying an older, decommissioned array as a disk-as-disk target. Decommissioned arrays are often end-of-life units without service contracts, so these contracts should be resumed if you're using the unit in a production system. (Service contracts on older equipment can be expensive; be sure to compare the cost of resuming the contract to that of a new system with a contract included.) Another advantage of disk-as-disk backup targets is that most backup software companies don't currently charge to back up to them; unfortunately, this is changing.

The final advantage of disk-as-disk targets is their flexibility, which may come into play if you plan on moving away from a traditional backup architecture to, for example, a data-reduction backup or a replication-based backup system. A data-reduction backup system tries to eliminate redundant blocks of backed up data, reducing the amount of data sent across the network and stored on the secondary storage system. A replication-based backup system uses replication as the mechanism to move data to a secondary location where it's then backed up. If one of these two architectures is possibly in your future, you might want to consider a disk-as-disk target now, because that's exactly what both need as a target: You can't replicate to tape, and data-reduction backup systems are also designed to go to disk-as-disk.

Disadvantages of disk-as-disk targets
Backup software companies are starting to charge for backing up to a disk-as-disk target, a trend that's expected to continue. Vendors defend this move because they're providing additional functions to their backup software. The going price to use a disk array as a staging device before data is moved to tape is approximately $2,000/TB. To use a 200TB disk array as a disk-as-disk target could add $400,000 to your backup software tab.
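The licensing arithmetic above is easy to rerun for your own capacity. The sketch below assumes the approximate $2,000/TB figure quoted in the text; the function name is hypothetical, and actual vendor pricing will differ.

```python
def disk_target_license_cost(capacity_tb, price_per_tb=2_000):
    """Estimated backup software charge for using disk as a backup target."""
    return capacity_tb * price_per_tb

print(disk_target_license_cost(200))  # 400000
```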

D2D2T backup terms
  • DISK-TO-DISK-TO-TAPE (D2D2T) BACKUPS: D2D2T backups are first sent to disk and eventually copied or moved to tape.
  • DISK-AS-DISK: A disk-based backup target that behaves as disk and doesn't take on the characteristics of tape.
  • NAS DISK-AS-DISK: A disk-as-disk backup target accessed via NFS or CIFS.
  • SAN DISK-AS-DISK: A disk-as-disk backup target accessed via Fibre Channel or iSCSI.
  • FILESYSTEM-BASED BACKUPS: Backups that are sent to a disk-as-disk backup target rather than a virtual tape library or physical tape.
  • VIRTUAL TAPE LIBRARY (VTL): A disk array and server running an application that makes the disk array look like a tape library to the backup software application.
  • INTEGRATED VTL: A VTL that directly integrates with a tape component and also manages the process of copying data from VTL disk to physical tape.
  • STANDALONE VTL: A VTL that stands by itself, like a regular tape library. It uses the backup software's tape-to-tape copy to migrate data from virtual tape to physical tape.

A disadvantage of disk-as-disk backup devices is the nature of filesystems. Files are written, opened, changed and stored back to the same place. The new version of the file often doesn't fit in the same place where the old file was, so a portion of it gets written to the original location while another part is written somewhere else on the disk, resulting in fragmentation. The more files you add, delete and modify, the more fragmented the filesystem. The way a backup system uses the disk results in significant fragmentation over time, which degrades performance.
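How fragmentation builds up can be shown with a toy block allocator. This is a deliberately simplified sketch: the disk is a list of 24 blocks, files are written into the first free blocks found, and all names are hypothetical. Real filesystems use far more sophisticated allocation, so treat this only as an illustration of the effect.

```python
def allocate(disk, name, size):
    """Write `size` blocks into the first free blocks found (not necessarily contiguous)."""
    free = [i for i, owner in enumerate(disk) if owner is None]
    if len(free) < size:
        raise RuntimeError("filesystem full")
    for i in free[:size]:
        disk[i] = name

def delete(disk, name):
    """Free every block owned by `name`."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = None

def extents(disk, name):
    """Count the contiguous runs of blocks owned by `name`."""
    runs, previous = 0, False
    for owner in disk:
        current = owner == name
        if current and not previous:
            runs += 1
        previous = current
    return runs

disk = [None] * 24
for n in range(6):
    allocate(disk, f"backup{n}", 4)   # six 4-block backups fill the disk
for n in (0, 2, 4):
    delete(disk, f"backup{n}")        # every other backup expires
allocate(disk, "backup6", 10)         # a new, larger backup arrives
print(extents(disk, "backup1"), extents(disk, "backup6"))  # 1 3
```

The early backup occupies one contiguous run, while the later one is split into three pieces scattered across the holes left by expired backups.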

Another issue when using disk-as-disk backup targets is that some backup software products don't back up to filesystems as well as they back up to tapes. For example, backup software products know exactly what to do when a tape fills up, but they're not always sure what to do when a filesystem fills up. Many of the major backup products require users to point disk-as-disk backups to a single filesystem. When that filesystem fills up, all the backups fail--even if another filesystem has adequate capacity. There are also other limitations, like the inability of some backup products to scan in filesystem images. If you let a tape expire from your backup catalog, most backup products will let you scan the tape, find what's on it and then enter its contents in the backup catalog. Some products can't do that with filesystem-based images.

Storing backups offsite is another challenge with disk-as-disk backup targets. The normal procedure would be to copy the disk-based backups to a physical tape and then ship the tape offsite. The problem is that most people don't currently create copies of their tape-based backups; they'd have to start doing so if they started backing up to a disk-as-disk target. Therefore, you need to learn how to copy disk-based backup data to tape and then learn how to automate the process. These two steps can range from extremely easy to extremely difficult, depending on the backup product you use, and may also require the purchase of additional software from your backup vendor. Whatever method you choose to get the data from disk to tape, remember that the data is now moving twice, where before it moved only once. This means you'll need to budget time for the data to make that second move.

One final disadvantage of disk-as-disk targets is the lack of compression. While there's currently one NAS disk-as-disk target that uses data-reduction techniques on backups stored on that device (Data Domain Inc.'s DD200), most disk-as-disk targets don't have built-in compression. This means you may need twice as much disk with a disk-as-disk target as you would with a VTL that supports compression. (In-band, software-based compression products typically come with a rather hefty performance penalty--as much as 50%. In its DX100 VTL, Quantum Corp. claims to offer hardware-based compression that doesn't degrade performance.)

SAN disk-as-disk targets
A SAN disk-as-disk target is simply a disk array connected to the SAN and attached to one or more backup servers. (Some products, like IBM's TSM, can back up to raw disk.) The backup server typically puts a filesystem on the array and writes to that filesystem. The advantage over a NAS disk-as-disk system is the better write performance typical of a high-end SAN disk array compared to an Ethernet NAS filer.

  • Most backup software companies will begin charging to back up to a disk-as-disk target.
  • Ask your backup software vendor what its plans are.
However, when you use a disk array as your backup target, you replicate into your secondary storage all of the provisioning issues of your primary storage. All of that hassle with associating disks to RAID groups, RAID groups to servers, and volumes to filesystems now needs to be done on the back end of your backup system. This problem is compounded when there are multiple backup servers. When using a tape library or VTL, most backup software packages know how to share these devices; they don't know how to share a SAN-based filesystem. If you're using a SAN disk-as-disk target with multiple backup servers, you'll have to decide how large each backup server's volume needs to be and allocate the appropriate amount of space to each backup server.

NAS disk-as-disk targets
A NAS disk-as-disk target solves the provisioning issues of a SAN disk-as-disk target by putting the disks behind a NAS head, making a giant volume and sharing that volume via NFS. Generally, such systems are easier to maintain than traditional disk arrays. But easier management comes with a price. The filer head and filer OS increase the cost of the system, and performance will be limited to the throughput of the filer head. Depending on the size of your backups, performance may not be an issue. If you're a NAS shop with many other filers, a NAS disk-as-disk target makes sense--especially if you're using replication-based backup.

  • SAN arrays will offer better backup system performance.
  • NAS filers will be easier to manage and maintain, but throughput will be limited by the filer head.
Disk-as-disk targets provide a quick and inexpensive way to start backing up to disk. Yet they have many disadvantages when used with a traditional backup system. If you're going to use a disk-as-disk system, you'll need to choose either a SAN or a NAS unit. A SAN device may be more powerful than a NAS unit, but the SAN device will be more difficult to maintain and share.

Next we'll explore how to use VTLs with your backup system, describe their advantages and disadvantages vs. disk-as-disk systems, and explain the advantages and disadvantages of the two different kinds of VTLs.

VTL advantages
VTLs offer two main advantages over disk-as-disk backup targets: ease of management and better performance. As described earlier, there are various ways to use disk to protect data. A disk-as-disk target requires all of the usual provisioning steps of standard shared storage arrays. In contrast, if you tell a VTL how many virtual tape drives and virtual cartridges it should emulate, the VTL software automatically handles all of the provisioning and allocates the appropriate amount of disk to each virtual cartridge.

If the VTL needs to be expanded (not all VTLs are expandable), you just connect the additional storage, tell the VTL it's there and the VTL will automatically begin using the new storage. There's no volume manager to run and no RAID groups to administer.

Another important management advantage of VTLs is how easy it is to share VTLs among multiple servers and apps (see "Should a virtual tape library be shared?"). To share a VTL among multiple backup servers running the same software, use the built-in library sharing capability most backup products have. To share a VTL among multiple servers running different apps, partition the VTL into multiple smaller VTLs, assign a number of virtual cartridges to each VTL and associate each VTL with a different backup server. These scenarios are much easier than what's required to share a disk-as-disk target among multiple backup servers.

Better performance
To understand the performance advantages of VTLs, think of how backup applications write data to tape. A backup app typically continues writing to a tape until it hits the physical end of tape (PEOT). It will append to a tape, even if some of the previously written data has expired. Once the backup app hits PEOT, the tape is considered full. Most backup apps leave everything on the tape until all of the backups on that tape have expired; then they expire the whole tape and write to it from the beginning. Other backup apps wait until a certain percentage of the backups on a tape have expired before "reclaiming" that tape by migrating the non-expired backups to a second tape. The first tape is then expired and ready to be overwritten. The bottom line is that portions of a tape can't be overwritten.
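The two tape-reuse policies described above can be sketched as follows. This is a minimal model, assuming each tape simply records which of its backups have expired; the class, function names and the 50% threshold are illustrative assumptions, not the behavior of any particular backup application.

```python
from dataclasses import dataclass, field

@dataclass
class Tape:
    label: str
    backups: dict = field(default_factory=dict)  # backup name -> expired (True/False)

    def expired_fraction(self):
        if not self.backups:
            return 1.0  # an empty tape has nothing left to protect
        return sum(self.backups.values()) / len(self.backups)

def reusable_whole_tape(tape):
    """Policy 1: overwrite the tape only once every backup on it has expired."""
    return tape.expired_fraction() == 1.0

def reclaim(source, target, threshold=0.5):
    """Policy 2: once enough backups have expired, migrate the live ones and expire the source."""
    if source.expired_fraction() < threshold:
        return False
    for name, expired in source.backups.items():
        if not expired:
            target.backups[name] = False  # live backup moves to the second tape
    source.backups.clear()                # source is now ready to be overwritten
    return True

t1 = Tape("T1", {"mon": True, "tue": True, "wed": False})
t2 = Tape("T2")
print(reusable_whole_tape(t1))  # False: "wed" is still live
print(reclaim(t1, t2))          # True: two of three backups have expired
print(t2.backups)               # {'wed': False}
```

In neither policy is an expired portion of the tape overwritten in place; the tape is reused only as a whole.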

NetBackup's inline tape copy
Veritas NetBackup (now owned by Symantec) supports a feature called inline tape copy, which allows sending a backup to two tape drives simultaneously--creating an original and a copy in one step. An alternative is to use a standalone virtual tape library (VTL), and to send one copy to the physical tape and one to the VTL. The shortcoming with this approach is that it causes the VTL to run at the speed of the tape drive--defeating the purpose of going to disk backup in the first place. A more interesting approach would be to use an integrated VTL, send both backups to virtual tape, and then use the VTL to create the physical tape in the background.

This differs from how backup applications write to a file system. The application tells the OS it wants to write to a certain file name and then begins writing data to that file. Each backup gets its own file and when that file expires, it's deleted. The backup application has no knowledge of how this data is actually written to disk. Underneath the covers, the bytes of any given file are fragmented all over the disk, which results in performance degradation of the backup.

Because a VTL treats disk like tape, it eliminates fragmentation by writing backups to contiguous sections of disk. The blocks allocated to a tape stay allocated to that tape until the backup app starts overwriting that tape, at which point the VTL can once again write to contiguous sections of disk--just like data is written to tape. Because VTL vendors control the RAID volumes, they ensure that a given RAID group is only written to by a single virtual tape. A disk can perform much better if it's only writing/reading for a single app using contiguous sections of disk. This key difference explains why the fastest file systems write in hundreds of megabytes per second, while the fastest VTLs write in thousands of megabytes per second.

VTLs offer other advantages, as well. With one exception (see the next section), VTLs work with all existing backup software, processes and procedures (see "NetBackup's inline tape copy," this page, and "Do IBM Tivoli Storage Manager users need a VTL?"). In other words, everything works exactly as it would with a physical tape library (PTL). That isn't the case with disk-as-disk targets, where backup software can behave quite differently.

VTL disadvantages
The disadvantage of VTLs cited by most storage admins is cost. They believe that if a disk array costs x, a disk array made to look like a VTL will cost x + y. But the y factor can vary from one VTL vendor to another. Most VTLs use capacity-based pricing, which means the cost is quoted in dollars per gigabyte. At least one VTL vendor uses throughput-based pricing, so the price is determined by the number of Fibre Channel (FC) connections. The actual price of VTLs with disk included ranges from less than $4/GB to a little more than $12/GB. Disk-as-disk units fall into roughly the same price range, so it's basically a misconception that a VTL will always cost more than a disk-as-disk device.

Another issue is the price of backup software licensing. If a VTL sits next to an existing tape library, it will most likely require an additional tape library license for a library that's actually not there. This adds to the price of the VTL. How much you pay is based on how the VTL is configured and how your backup software charges for libraries. Some backup software products (e.g., IBM's TSM) have a single license for all tape libraries, while others charge for the number of slots or drives. When deciding how to configure your VTL, consider how your backup software charges for libraries. When comparing VTLs to disk-as-disk targets, you also need to remember that backup software products are beginning to charge to back up to disk-as-disk targets. These licensing challenges will probably go away as backup software vendors move toward capacity-based pricing in an effort to appear more VTL friendly. (NetBackup offers this kind of pricing today.)

Most VTLs that offer compression use in-band software compression, which saves space but results in a significant performance hit--as much as 50%. If your backup speed is throttled by the speed of your clients and/or network, you may not see this performance hit. But in local or LAN-free backups, speed tends to be most affected by the backup device. Some vendors perform their compression after the fact, attempting to give you the benefits of compression without the performance loss. As of this writing, only Quantum supports hardware compression that doesn't impact performance.

Should a virtual tape library be shared?
Partitioning makes it possible to share a VTL among backup servers running the same application; however, this can increase costs if your backup software charges by the drive. For example, assume you have seven servers, each of which needs 10 tape drives once a week for their full backup. You could create 10 virtual tape drives and share them, or you could create 70 virtual drives and give each server the 10 tape drives it needs.

Ejecting virtual tapes
How you eject virtual tapes will determine whether you require a standalone or integrated VTL. As discussed previously, a major advantage of VTLs is that they don't require any changes to your existing backup process or configuration. The one exception is if you don't copy your backup tapes and send the copies offsite. Although it isn't a best practice to do so, many environments eject their original tapes and send them offsite. This works fine with a PTL but, as of this writing, only one VTL (Spectra Logic) supports the ejection of virtual tapes. Therefore, companies that eject their original tapes and wish to use a VTL must do one of two things: learn how to copy tapes or use an integrated VTL. The approach that's best for your environment will be based on individual preference.

Some observers believe the tape-to-tape copy method with standalone VTLs is the only proper way to create physical tapes from virtual tapes. (Standalone VTLs include those from Diligent Technologies Corp., Quantum and Sepaton Inc.) The tape-to-tape copy method allows the backup software to control the copy process, integrating the copy process into normal reporting procedures. However, there are two challenges. The first is the difficulty related to automating this process. Some backup products require the purchase of an additional license, and some need a custom script for this process. The second challenge is that many environments don't have enough time and resources to copy their backup tapes quickly enough. For many companies, it's all they can do to get their backups done in time to be picked up by Iron Mountain. If you know how to copy your backup tapes, and have sufficient resources to do so, this won't be an issue.

If the challenge of copying virtual tapes to physical tapes is a concern, you should consider an integrated VTL, such as those offered by Advanced Digital Information Corp. (ADIC), Alacritus Software, EMC Corp., FalconStor Software, Maxxan Systems Inc., Neartek Inc. and Spectra Logic.

An integrated VTL sits between your backup server and PTL. It inventories the PTL and represents its contents as virtual tapes in the VTL. For example, if you have physical tape X01007 in your PTL, virtual tape X01007 will appear in your VTL. Your backup software will then back up to virtual tape X01007. At some user-configurable point, the VTL internally copies virtual tape X01007 to physical tape X01007. When the backup software tells the VTL to eject virtual tape X01007, physical tape X01007 appears in the PTL's mail slot. An important point is that physical tape X01007 looks just like it would if the backup software had backed up to it directly. The backup software thinks it backed up to and ejected physical tape X01007 and, in the end, that's what it did. Bar-code matching maintains the consistency between the backup software's media manager and the physical tapes. But you need to remember that this method doesn't result in two copies of the tape. The virtual copy of the tape is deleted when the physical copy is successfully created.
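The bar-code matching flow can be modeled in a few lines. This is a conceptual sketch only: the class and method names are hypothetical, the "tapes" are plain Python lists, and the choice to copy at eject time if no copy has happened yet is an assumption made for illustration, not documented behavior of any product.

```python
class IntegratedVTL:
    def __init__(self, physical_barcodes):
        # Inventory the PTL and present one virtual tape per physical bar code.
        self.virtual = {barcode: [] for barcode in physical_barcodes}
        self.physical = {barcode: [] for barcode in physical_barcodes}
        self.mail_slot = []

    def backup(self, barcode, data):
        # The backup software writes to the virtual tape.
        self.virtual[barcode].append(data)

    def copy_to_physical(self, barcode):
        # Internal copy; the virtual copy is deleted once the physical one exists.
        self.physical[barcode] = list(self.virtual[barcode])
        del self.virtual[barcode]

    def eject(self, barcode):
        # Assumption: if the tape hasn't been copied yet, copy it before ejecting.
        if barcode in self.virtual:
            self.copy_to_physical(barcode)
        self.mail_slot.append(barcode)

vtl = IntegratedVTL(["X01007"])
vtl.backup("X01007", "full-backup")
vtl.copy_to_physical("X01007")
vtl.eject("X01007")
print(vtl.mail_slot, vtl.physical["X01007"])  # ['X01007'] ['full-backup']
```

After the eject, only the physical tape holds the data, matching the point above that this method doesn't leave you with two copies.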

There are some issues with this method. For example, what happens when the copy from the virtual tape to the physical tape fails? If the copy failed because the actual tape is bad, you'll need to remove the tape, swap its bar code to a new tape, put the new tape in the PTL and tell the VTL to try the copy again. (This will work only if your bar codes are removable.) If this happens occasionally, it's not a major disadvantage. But if it happens every day, it becomes disruptive.

Do IBM Tivoli Storage Manager users need a VTL?
While IBM's Tivoli Storage Manager (TSM) backs up directly to disk quite well, TSM administrators will experience provisioning and fragmentation issues if they begin storing all onsite backups on disk. (Most TSM disk storage pools aren't fragmented because they're immediately migrated to tape every night.) So, the advantages of virtual tape libraries (VTLs) apply to TSM as much as they apply to other backup products. In addition, a VTL would let TSM users create thousands of small virtual tapes, allowing them to turn on collocation for all clients without the usual penalty of hundreds of partially used tapes. It would also allow users to have dozens of virtual tape drives to perform reclamation at any time without causing contention.

You also need to realize that this process is happening without the knowledge of the backup software, so if something happens with a tape copy, the VTL will need to notify you of the problem. This results in another reporting interface, which might be considered a disadvantage. Another potential problem arises if the VTL puts more data on the virtual tape than can fit on the physical tape, preventing creation of a physical copy of the tape. Integrated VTL vendors ensure that this doesn't happen by stopping before the normal PEOT. However, standalone vendors might say this practice increases the number of tapes to purchase and handle, and adds to your costs.

Important VTL features
There are a number of differences among the major VTLs. Some (Alacritus, Diligent, FalconStor) are software only, so you can buy the software and run it on a regular disk array. Other VTL vendors (Maxxan, Neartek) sell a VTL head, which is analogous to a filer head. You use their software and head, but supply your own disk. Finally, some VTL vendors (ADIC, EMC, Quantum, Sepaton and Spectra Logic) offer an entire solution: software, head and disk. Software-only and VTL head vendors allow you to redeploy an existing array, reducing your cost. Turnkey products cost more, but have the fewest integration issues.

Most VTLs offer replication or cascading, which replicates one VTL's backups to another VTL. But the tapes in the second VTL won't be considered duplicates by your backup software because they'll have the same bar codes as the original tapes. Also, remember that you'll probably be replicating the entire backup, and most backups aren't block level. Even incremental backups take up roughly 1% to 5% of the amount of data being backed up. This means you'll need to replicate 1% to 5% of your data center every night--a significant undertaking for many environments. Therefore, it may only be possible to use this feature within a campus, as opposed to including data from remote sites. Today, replication is offered by Alacritus- and FalconStor-based VTLs.
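To judge whether nightly replication is feasible, it helps to turn the 1% to 5% figure into hours on a given link. The sketch below is a rough estimate that ignores protocol overhead, compression and link contention; the function name and the example values (50TB protected, a 1,000Mbps link) are illustrative assumptions.

```python
def nightly_replication_hours(protected_tb, change_rate, link_mbps):
    """Hours needed to send one night's incremental backup over a link."""
    changed_bits = protected_tb * 1e12 * 8 * change_rate  # decimal terabytes to bits
    seconds = changed_bits / (link_mbps * 1e6)
    return seconds / 3600

for rate in (0.01, 0.05):
    hours = nightly_replication_hours(50, rate, 1000)
    print(f"{rate:.0%} change rate: {hours:.1f} hours")
```

Under these assumptions, the low end of the range takes a little over an hour and the high end more than five hours, which shows why slower WAN links to remote sites can make this impractical.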

Some VTL vendors are beginning to offer a feature where their VTLs will examine the incremental backup, identify the changed blocks within that backup and replicate only the changed blocks. When that functionality becomes more widely available, replication between data centers will be much easier to accomplish. Diligent is the first to announce such a product with its ProtecTier offering.

If you have a heterogeneous environment with mainframe, AS/400 and open systems, you might consider a VTL that supports all three environments. Only Neartek currently offers this functionality.

A few integrated VTLs (FalconStor and Neartek) offer a feature called stacking. Stacking copies multiple virtual tapes onto one physical tape, a feature borrowed from mainframe virtual tape systems (VTS). Stacking was important to mainframes because apps were unable to append to a tape. The VTS would present hundreds of small virtual tapes to the app and then stack those virtual tapes onto one physical tape, significantly cutting media costs.

However, the value of stacking in most open-systems environments is questionable because any decent backup product can append to a tape until it's full. You should be aware that the use of stacking breaks the relationship between the backup software's media manager and the physical tape. Products that support stacking must read the entire stacked tape to read just one of the virtual tapes included on that tape. This feature is useful only if you gain a benefit akin to that achieved in the mainframe environment.

Alexander Dubose Jones and Townsend LLP, a small appellate law firm with offices in Houston and Austin, Texas, moved from tape to LiveVault Corp.'s InSync (LiveVault service), a continuous data protection product. Vicki McArthur, the firm's administrator, says they had previously relied on daily tape backup as well as on a seven-day offsite tape rotation. The firm experienced all of the challenges traditionally found in tape environments, but recoveries concerned McArthur the most. Nightly backups don't work well with the nature of the legal industry, where files often require last-minute changes. Under a traditional backup system, files created in the morning wouldn't get backed up until that night, and wouldn't be sent offsite until at least the next day. "We faced the possibility of losing an entire day's worth of work or worse," says McArthur.

LiveVault makes a backup of files as soon as they're saved, and then replicates them to a remote site within a few minutes, where all previous versions of any file are accessible at any time. Because only changed bytes are sent, very little bandwidth is required. And since data is replicated every 15 minutes, McArthur believes that "the amount of data loss due to user error is reduced to minutes, possibly less."

You also need to think about which type of notification the VTL supports, especially if you're considering an integrated VTL. Some support SNMP traps, a few support e-mail notification, while others require a storage admin to log into a Web page to be notified of any issues.

If high-end performance is important, you should look for a VTL with a multiple data-mover architecture. Most VTLs run all software on one VTL head. Some vendors use the VTL head as a control mechanism, while passing the movement of the data on to one or more data movers. Need more performance? Simply purchase more data movers. This allows scaling to a much higher level without having to add and administer another VTL (Diligent, Neartek and Sepaton use this approach).

Finally, remember that VTLs don't perform at the same level, so it's important to conduct performance testing in your environment.

Alternative backup methods
If you have a centralized data center with a four-hour recovery time objective (RTO), a 24-hour recovery point objective (RPO), a 24-hour synchronicity requirement and an eight-hour backup window, you can stop reading now. But if your backup requirements include remote, unattended data centers, a five-minute RTO, a 15-minute RPO or a non-existent backup window, alternative backup systems can help bring some needed sanity to your storage environment.

Alternative backup options include snapshots, replication, continuous data protection (CDP) and data reduction backup (DRB). These technologies will reduce backup and restore times, and help meet requirements such as RTO, RPO, backup window and synchronicity.

RTO--how long it takes to recover a system--can range from zero seconds to several days or even weeks. Each piece of information serves a business function, so the question is how long the business can live without that function. If the business can't live without it for one second, then the RTO is zero.

RPO is determined by how much data a business can afford to lose. If the business can lose three days' worth of a set of data, then the RPO is three days. If the data is real-time transactions essential to the business, the RPO is zero for that application.

There can also be an RPO for a group of machines. If several systems are related to each other, they may need to be recovered to the same point in time. This is the synchronicity requirement; to meet it, all related systems have to be backed up at exactly the same time. In disaster recovery circles, such sets of related systems are known as consistency groups.

Setting RPO, RTO requirements
All RPO, RTO and synchronicity requirements must be business-centric. Before deciding what these requirements are, you should first analyze and prioritize the business functions, and assign each computer system the recovery priority of the business function it serves. Next, decide on an RTO and RPO for each system and type of disaster--from the loss of a disk to the loss of a metropolitan area. Some systems will have the same requirements for all types of disasters; others may have tougher requirements for specific types of disasters.

Once you've determined an RTO and RPO for each system and disaster type, the final step is to determine how long it will take to back up the system and how much the backup will impact the production system.
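The steps above can be sketched as a small requirements table. This Python example is illustrative only: the system names, disaster types and numbers are hypothetical, and it assumes the worst-case data loss of a scheduled backup equals the interval between backups.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    rto_hours: float  # longest acceptable time to recover
    rpo_hours: float  # largest acceptable window of lost data


@dataclass(frozen=True)
class Method:
    restore_hours: float          # measured or estimated time to restore
    backup_interval_hours: float  # time between recovery points


def meets(req: Requirement, method: Method) -> bool:
    """A method qualifies only if it satisfies both the RTO and the RPO."""
    return (method.restore_hours <= req.rto_hours
            and method.backup_interval_hours <= req.rpo_hours)


# Hypothetical requirements, keyed by (system, disaster type).
requirements = {
    ("mail", "disk loss"): Requirement(rto_hours=4, rpo_hours=24),
    ("mail", "site loss"): Requirement(rto_hours=24, rpo_hours=24),
    ("wire-transfer db", "disk loss"): Requirement(rto_hours=0.1, rpo_hours=0.25),
}

# Hypothetical candidate: a nightly backup that takes three hours to restore.
nightly_tape = Method(restore_hours=3, backup_interval_hours=24)

# Every (system, disaster) pair the candidate cannot satisfy.
unmet = [key for key, req in requirements.items() if not meets(req, nightly_tape)]
```

Any pair left in `unmet` is a candidate for one of the alternative methods; the backup window only enters the picture after this check.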

Everything should start with RTO and RPO, although very few people do it that way. Most people go right to the backup window. Instead, you should concentrate on meeting your RTO and RPO requirements, and the backup window will almost always fall right in line. The reverse isn't necessarily true, however. There are many things that will shrink your backup window but not help your recovery objectives. If your requirements are impossible to meet with a traditional backup system, the following technologies are worth considering.

SNAPSHOTS. The most common type of snapshot is a virtual copy of an original volume or file system. The reliance on the original volume is why snapshots must be backed up to provide recovery from physical failures (see "Match snaps to apps," p. 46). Snapshot functionality resides in a number of places, including advanced filesystems, volume managers, enterprise arrays, NAS filers and backup software.

First American Trust Federal Savings Bank, Santa Ana, CA, handles up to $2 billion worth of wire transfers each day. The bank was recently asked by the Securities and Exchange Commission (SEC) to restore one year's worth of Microsoft Exchange e-mail data--a significant request.

One division used Network Appliance (NetApp) Inc.'s unified storage solution, SnapManager for Exchange, and Single Mailbox Recovery software, while another division used traditional backup and tape. The results from the two divisions couldn't have been more different. "The SEC request made the need for using nearline storage to easily recover and access e-mail undisputable," says Henry Jenkins, chief technology officer at First American. "Our disk-based solution rose to the occasion, but damaged tapes and botched backups made restoring from tape excruciating for our sister division."

It took the bank only a few days to restore roughly 360GB of e-mail using the combination of hardware and software from NetApp. In contrast, it took several months for one bank IT staffer to restore a smaller volume of e-mail from tape.

First American also uses offsite replication of critical SQL Server databases, Exchange e-mail and flat-file data that's used to perform routine wire services. All of this critical data creates only 200MB of changed data blocks per day, which are then asynchronously replicated to a remote system located at a disaster recovery (DR) site approximately 100 miles away. The DR system has an RPO of four hours in the event of a site failure.

"SnapMirror software saves us time by not having to replay logs and data at the remote site is, on average, less than 15 minutes behind," says Jenkins. "Every year for the past three years, we've done a disaster recovery test and every year it's just a matter of bringing up the warm servers," he adds.

Snapshots can help you to meet aggressive backup requirements. For example, some snapshots can satisfy an RTO of a few seconds by simply changing a pointer. An aggressive RPO can be achieved by creating several snapshots per day and, because snapshots can be created in seconds, you can also meet stringent backup window requirements. For instance, it's possible to create a stable, virtual backup of a multiterabyte database in seconds--reducing the impact on the application to potentially nothing--which leaves hours to perform a backup of that snapshot. Finally, creating synchronized snapshots on multiple systems is also fairly easy.
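To see why a virtual snapshot can be created in seconds yet still depends on the original volume, consider this minimal copy-on-write sketch in Python. The `Volume` class and its methods are hypothetical and don't correspond to any vendor's implementation; real products differ in where and how they preserve old blocks.

```python
class Volume:
    """Toy block volume with copy-on-write snapshots (illustrative only)."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)  # block number -> current data
        self.snapshots = []         # one dict per snapshot: block number -> old data

    def snapshot(self):
        """Creating a snapshot copies no data, so it takes constant time."""
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, block_no, data):
        """Save the old contents for each snapshot that lacks them, then overwrite."""
        for snap in self.snapshots:
            if block_no not in snap:
                snap[block_no] = self.blocks.get(block_no)
        self.blocks[block_no] = data

    def read_snapshot(self, snap_id, block_no):
        """Unchanged blocks are served from the live volume, which is why a
        snapshot alone cannot survive the physical loss of that volume."""
        snap = self.snapshots[snap_id]
        return snap[block_no] if block_no in snap else self.blocks.get(block_no)


vol = Volume({0: "a", 1: "b"})
sid = vol.snapshot()   # instant: nothing is copied
vol.write(1, "B")      # old contents of block 1 are preserved first
```

After the write, the snapshot holds only the one block that changed; block 0 is still read from the live volume.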

There's a growing list of APIs that allow different vendors' products to interface with snapshots; the Network Data Management Protocol (NDMP) and Microsoft Corp.'s Volume Shadow Copy Service (VSS) are examples. NDMP lets backup products create a snapshot, and catalog and restore from its contents. VSS allows storage vendors with snapshot capability to have the files in those snapshots listed in and restored from the Previous Versions tab in Windows Server 2003. Hopefully, this capability will be added to workstation versions of Windows and more NAS vendors will support VSS.

Another interesting development is the creation of database agents that work with snapshots. The database agent communicates with the database so that the database believes it's being backed up, when all that's really happening is the creation of a snapshot. Recoveries can be incredibly fast when the process is controlled by the database application.

REPLICATION. Replication is the practice of continually copying from a source system to a target system all files or blocks that have changed on the source system. Replication used to be what companies implemented after everything was completely backed up and redundant, which meant that few used replication. However, many people are now using replication as their first line of defense for providing backup and disaster recovery.

Replication by itself is not a good backup strategy; it copies everything, including viruses and file deletions. Therefore, a replication-based backup system must be able to provide a history by either occasionally backing up the replicated destination or through the use of snapshots. It's usually preferable to make a snapshot on the source and replicate that snapshot to the destination. That way, you can prepare database applications for backup, take a snapshot and then have that snapshot replicated.
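The difference between a bare replica and a replica with history can be shown in a few lines of Python. This is a sketch under simplifying assumptions: the file names are hypothetical, and a full copy of a dictionary stands in for a snapshot.

```python
import copy


def replicate(source: dict, target: dict) -> None:
    """Make the target an exact mirror of the source, deletions included."""
    target.clear()
    target.update(copy.deepcopy(source))


source = {"report.doc": "v1", "budget.xls": "v1"}
replica = {}
history = []  # point-in-time copies kept alongside the replica

# Take a snapshot on the source, then replicate.
history.append(copy.deepcopy(source))
replicate(source, replica)

# An accidental deletion on the source is faithfully replicated.
del source["budget.xls"]
history.append(copy.deepcopy(source))
replicate(source, replica)

deleted_on_replica = "budget.xls" not in replica
recoverable = history[0]["budget.xls"]
```

The replica alone has lost the file; only the earlier point-in-time copy can bring it back.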

Kelly Overgaard, systems manager at Adaptec Inc., was fed up with tape. "Our old system was at capacity, and something was always breaking," he says. "When we looked at disk-based solutions, our goal was to completely get rid of tape--especially for remote sites."

Adaptec chose an Avamar Technologies Inc. Axion system that uses "commonality factoring" to identify duplicate blocks of data throughout its enterprise and to transmit only the new, unique blocks of data each time it backs up. This allows Adaptec to back up and recover smaller remote offices directly to its central data center. Larger offices, or those with shorter recovery time objectives, can be backed up to a local target device at the remote site, which then replicates to a second device in its central data center. This flexibility to use (or not use) a local recovery device let Adaptec deploy this solution to several sites.

Overgaard says that because the commonality factoring is performed on the client, it requires slightly more CPU than traditional backup, but "no one has mentioned any ill effects." He considers himself a happy customer, but says he's unsure if the system will be able to back up Adaptec's large databases.

But Overgaard doesn't believe he can afford to store his firm's backups with long-term retention on the Axion system, so he also performs a monthly full tape backup of Axion clients using Adaptec's previous tape system, and then sends that offsite for several years. Avamar says he'll soon be able to make such tape backups by simply exporting the appropriate data directly from the Axion system.

When used with snapshots, replication requires only tiny backup windows. The snapshot takes just seconds to create, and replication is the quickest way to back up that snapshot to another device. You can also cascade replication to provide multiple copies, such as an onsite and offsite copy. If you want to provide a tape copy of the replicated snapshot, just back up one of the destination devices. But replication software doesn't usually provide recovery features. The RTO, RPO and synchronicity requirements that you'll be able to meet will be based on how you're performing snapshots or backups, and how quickly they'll be able to recover.

DRB SYSTEMS. DRB systems were designed to answer the following questions: If only a few bytes in a file change, why back up the entire file? If the same file resides in two places on the same system, why back it up twice? Why not store a reference to the second file? And why waste server and network resources by backing up the same file across multiple systems?

By backing up a file once, and then backing up only the changed bytes, backup windows are reduced. Tape copies of disk-based backups can usually be created at any time, depending on your requirements. Some DRB products can meet aggressive RTO requirements by restoring only the blocks that have changed since the file was last backed up. The RPO and synchronicity abilities of DRB products are based on how often you back up, but it's common to back up hourly.
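The idea of storing each piece of data once and referencing it thereafter can be sketched with a content-addressed chunk store. The Python below is an assumption-laden illustration, not any vendor's algorithm: the fixed four-byte chunks, the SHA-256 digests, and the `ChunkStore` class are all choices made for readability, and the catalog keeps only the latest version of each file.

```python
import hashlib

CHUNK_SIZE = 4  # unrealistically small so the example is easy to follow


class ChunkStore:
    def __init__(self):
        self.chunks = {}   # digest -> chunk bytes, stored once
        self.catalog = {}  # (client, path) -> ordered list of digests

    def backup(self, client: str, path: str, data: bytes) -> int:
        """Store a file and return the number of bytes actually added."""
        added = 0
        digests = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:  # new, unique data
                self.chunks[digest] = chunk
                added += len(chunk)
            digests.append(digest)         # otherwise just a reference
        self.catalog[(client, path)] = digests
        return added

    def restore(self, client: str, path: str) -> bytes:
        return b"".join(self.chunks[d] for d in self.catalog[(client, path)])


store = ChunkStore()
first = store.backup("srv1", "/etc/app.conf", b"AAAABBBBCCCC")   # all new
second = store.backup("srv2", "/etc/app.conf", b"AAAABBBBCCCC")  # same file, second system
third = store.backup("srv1", "/etc/app.conf", b"AAAAXXXXCCCC")   # one chunk changed
```

Here the first backup stores 12 bytes, the identical file on a second system stores none, and the changed file stores only the 4 bytes that differ.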

The biggest advantage to DRB products is that, from the user adoption perspective, they're the closest to what users know. Their interfaces are similar and they often have database agents like traditional backup software. They're also able to back up faster and more often, and use much less bandwidth.

CDP. A CDP system is basically an asynchronous, replication-based backup system. The software runs continuously on the client to be backed up, and each time a file changes, the new bytes are sent to the backup server within seconds or minutes. But unlike replication, a CDP system can roll back to any point in time.

Pros and cons of alternative backup methods
CDP products transfer data to the backup server in different ways. Some transfer changed blocks immediately, while others collect changed blocks and send them every few minutes. They also differ in how they do recoveries. Some products are able to restore only the blocks that have changed from a particular point in time, while other programs operate in a more traditional manner by recovering the entire file or filesystem. The first method accommodates more aggressive RTOs and RPOs than the second method. Also, CDP products can meet any type of synchronicity requirement because they can recover one, 10 or 100 systems to any synchronized point in time.

Another difference in CDP products is that some are database-centric and work only with a particular database, such as Microsoft Exchange or SQL Server. Most file-based CDP products aren't going to provide interfaces for your database apps. Instead, these CDP products copy blocks to the backup destination in the same order they're changed on the client. When you restart your database after a restore, it goes into the same mode it would enter if the server had crashed (i.e., crash recovery mode). It examines the data files, figures out what's inconsistent, rolls backward or forward any necessary transactions or blocks, and then the database is up. If the CDP product puts the blocks back in the exact order in which they were changed, then the database should be able to recover from any point in time. Some products can even present a logical unit number (LUN) or volume to your database that it can mount and test before you do the recovery.
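A block-level CDP journal that preserves write order might look something like the following Python sketch. The `Journal` class, its method names, and the timestamps are hypothetical; real products add far more (compression, retention, consistency markers), but the core idea of replaying ordered writes up to a chosen moment is the same.

```python
from dataclasses import dataclass, field


@dataclass
class Journal:
    baseline: dict                               # block number -> data at the start
    entries: list = field(default_factory=list)  # (timestamp, block number, data)

    def record(self, timestamp: float, block_no: int, data: str) -> None:
        """Append a write; entries must arrive in the order they were made."""
        if self.entries and timestamp < self.entries[-1][0]:
            raise ValueError("writes must be journaled in order")
        self.entries.append((timestamp, block_no, data))

    def image_at(self, point_in_time: float) -> dict:
        """Rebuild the volume by replaying writes up to the requested time."""
        image = dict(self.baseline)
        for timestamp, block_no, data in self.entries:
            if timestamp > point_in_time:
                break
            image[block_no] = data
        return image


journal = Journal(baseline={0: "a", 1: "b"})
journal.record(10.0, 0, "A")
journal.record(20.0, 1, "corrupt")

# Recover to the moment just before the bad write.
before_corruption = journal.image_at(19.9)
```

Because every system's journal can be replayed to the same timestamp, the same mechanism also addresses the synchronicity requirement.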

Some CDP vendors, like Kashya and Mendocino, integrate with database vendors. In addition to continuously copying blocks from source to destination, they integrate with your apps to create consistent recovery points that can be used to recover your database without it having to go into crash recovery mode. Keeping the app out of recovery mode can save a lot of time during a restore.

Your database vendor may have a different opinion about CDP: If you're not using their supported backup method, they may not be helpful if something goes wrong. Discuss the support issue with your database vendor and include your DBA in the discussion.

Aggressive requirements
You should consider switching backup products only if your current backup product can't meet your requirements (see "Pros and cons of alternative backup methods," previous page). There are many requirements--such as remote office data protection, backing up large databases, and an app with an RPO of zero--that might have you considering alternatives.

NetBackup 6.0 from Veritas Software (now owned by Symantec) addresses many manageability limitations with its disk storage units (DSUs), the firm's term for disk-as-disk backup targets. Version 6.0 users will be able to configure the size of each DSU, point backups to a group of DSUs, and have those backups fail over to other DSUs based on a choice of usage algorithms. Disk-staging storage units will be able to perform multiple, simultaneous de-staging processes.

It gets more interesting when a Network Appliance NearStore device is used as a NetBackup DSU. With a NetBackup 6.0 master/media server, NearStore will perform data reduction techniques on the incoming NetBackup data stream, reducing the amount of actual disk it will take to store full and incremental backups, thus reducing the effective per-gigabyte cost of the total solution. The next version of NetBackup will present backups as NFS- and CIFS-mountable snapshots, allowing users to browse through their backed-up files without using the NetBackup GUI or bothering NetBackup admins. While this is a great feature, storage admins should consider its security implications before a company-wide implementation.

The most common area where backup requirements are difficult to meet is the remote office. Traditional backup schemes can't meet remote office RTO/RPO requirements. There's either too much data or not enough bandwidth to support a reasonable RTO or backup window. Any CDP product can provide backup and recovery of a remote office; most offer two methods. If long RTOs are acceptable, remote sites can back up directly to your central office. In the case of a disaster, just copy the data from the central data center to a disk or tape and send it to the remote site. If this meets RTO requirements, it's the least-expensive option. For tighter RTO requirements, install a backup device at the remote office. The remote office systems can back up to it, and it can then replicate the data to the central site. This provides local recovery and disaster recovery without touching a tape.
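A quick calculation shows why bandwidth drives the choice between these two methods. The Python sketch below is only an estimate: the T1 link speed is a hypothetical choice, the 200MB and 360GB figures are borrowed from the First American example earlier in this article purely as sample sizes, and protocol overhead is ignored unless an efficiency factor is supplied.

```python
def transfer_hours(size_bytes: float, link_bits_per_second: float,
                   efficiency: float = 1.0) -> float:
    """Hours needed to move size_bytes over a link at the given usable fraction."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return size_bytes * 8 / (link_bits_per_second * efficiency) / 3600


T1 = 1.544e6          # bits per second; hypothetical WAN link to the remote office
daily_change = 200e6  # 200MB of changed blocks per day (sample figure)
full_restore = 360e9  # 360GB full data set (sample figure)

daily_hours = transfer_hours(daily_change, T1)          # nightly changed data
restore_days = transfer_hours(full_restore, T1) / 24    # full restore over the WAN
```

Under these assumptions the daily changes cross the link in well under an hour, while a full restore over the same link would take about three weeks, which is why shipping a disk or tape, or keeping a local backup device, is the practical way to meet a remote office RTO.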

CDP products are also superior to traditional backup methods when backing up very large databases. There isn't enough time or horsepower available to transfer several terabytes of data to tape every day. A CDP product could continually back up a database throughout the day, with no noticeable backup window or application impact. Depending on the product, a stringent RTO and short RPO could also be met. Some products also provide a disk-based copy that can be used in a disaster situation while the real volume is being recovered.

Finally, some database applications require a zero RPO. Most databases can meet such a requirement if they're configured correctly, and if the transaction log is backed up throughout the day. If your database supports that kind of functionality, it's probably best to stick with it. If not, try one of these newer methods.
