A growing list of APIs allows different vendors' products to interface with snapshots; the Network Data Management Protocol (NDMP) and Microsoft Corp.'s Volume Shadow Copy Service (VSS) are examples. NDMP lets backup products create a snapshot, then catalog and restore from its contents. VSS allows storage vendors with snapshot capability to have the files in those snapshots listed in, and restored from, the Previous Versions tab in Windows Server 2003. Hopefully, this capability will be added to workstation versions of Windows and more NAS vendors will support VSS.
Another interesting development is the creation of database agents that work with snapshots. The database agent communicates with the database so that the database believes it's being backed up, when all that's really happening is the creation of a snapshot. Recoveries can be incredibly fast when the process is controlled by the database application.
Replication by itself is not a good backup strategy; it copies everything, including viruses and file deletions. Therefore, a replication-based backup system must be able to provide a history, either by occasionally backing up the replicated destination or by using snapshots. It's usually preferable to make a snapshot on the source and replicate that snapshot to the destination. That way, you can prepare database applications for backup, take a snapshot and then have that snapshot replicated.
When used with snapshots, replication allows for tiny backup windows. The snapshot takes just seconds to create, and replication is the quickest way to back up that snapshot to another device. You can also cascade replication to provide multiple copies, such as an onsite and offsite copy. If you want to provide a tape copy of the replicated snapshot, just back up one of the destination devices. But replication software doesn't usually provide recovery features. The RTO, RPO and synchronicity requirements that you'll be able to meet will depend on how you're performing snapshots or backups, and how quickly those snapshots or backups can be restored.
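The idea of replicating snapshots rather than live data can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `Volume` class, the `replicate_snapshot` function and the snapshot names are hypothetical, and a real product would copy changed blocks between storage devices rather than Python dictionaries.

```python
class Volume:
    """Toy in-memory volume (hypothetical): file name -> contents."""

    def __init__(self):
        self.files = {}
        self.snapshots = {}  # snapshot name -> frozen copy of the files

    def snapshot(self, name):
        # Freeze the current state of the volume under a name.
        self.snapshots[name] = dict(self.files)


def replicate_snapshot(source, dest, name):
    """Copy one named snapshot from the source volume to the destination."""
    dest.snapshots[name] = dict(source.snapshots[name])


source, dest = Volume(), Volume()

source.files["report.doc"] = "v1"
source.snapshot("09:00")
replicate_snapshot(source, dest, "09:00")

del source.files["report.doc"]  # an accidental deletion on the source
source.snapshot("10:00")
replicate_snapshot(source, dest, "10:00")

# The deletion is replicated, but the earlier snapshot still holds the file.
restored = dest.snapshots["09:00"]["report.doc"]
```

Because the destination keeps each replicated snapshot, the deletion shows up in the 10:00 copy while the 09:00 copy still provides the history needed for a restore.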
DRB systems. DRB systems were designed to answer the following questions: If only a few bytes in a file change, why back up the entire file? If the same file resides in two places on the same system, why back it up twice? Why not store a reference to the second file? And why waste server and network resources by backing up the same file across multiple systems?
By backing up a file once, and then backing up only the changed bytes, backup windows are substantially reduced. Tape copies of disk-based backups can usually be created at any time, depending on your requirements. Some DRB products can meet aggressive RTO requirements by restoring only the blocks that have changed since the file was last backed up. The RPO and synchronicity abilities of DRB products are based on how often you back up, but it's common to back up hourly.
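The questions above can be illustrated with a small Python sketch. It assumes one possible technique, fixed-size blocks identified by SHA-256 hash, which is not necessarily how any particular DRB product works; the `BlockStore` class, the tiny `BLOCK_SIZE` and the system and path names are all hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small, chosen only to keep the example readable


class BlockStore:
    """Hypothetical backup store that keeps each unique block only once."""

    def __init__(self):
        self.blocks = {}   # SHA-256 hex digest -> block bytes
        self.catalog = {}  # (system, path, version) -> ordered list of digests

    def backup(self, system, path, version, data):
        """Store only blocks not seen before; return how many were new."""
        new_blocks = 0
        digests = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:
                self.blocks[digest] = block
                new_blocks += 1
            digests.append(digest)
        self.catalog[(system, path, version)] = digests
        return new_blocks

    def restore(self, system, path, version):
        """Rebuild a file version from its list of block references."""
        digests = self.catalog[(system, path, version)]
        return b"".join(self.blocks[d] for d in digests)


store = BlockStore()
first = store.backup("srv1", "/data/a.txt", 1, b"AAAABBBBCCCC")      # all blocks are new
changed = store.backup("srv1", "/data/a.txt", 2, b"AAAAXXXXCCCC")    # only the middle block changed
duplicate = store.backup("srv2", "/home/a.txt", 1, b"AAAABBBBCCCC")  # same file on another system
```

In this sketch the first backup stores three blocks, the second stores one and the duplicate on the other system stores none, while every version remains restorable from the catalog of references.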
The biggest advantage to DRB products is that, from the user adoption perspective, they're the closest to what users know. Their interfaces are similar and they often have database agents like traditional backup software. They're also able to back up faster and more often, and use much less bandwidth.
CDP. A CDP system is basically an asynchronous, replication-based backup system. The software runs continuously on the client to be backed up, and each time a file changes, the new bytes are sent to the backup server within seconds or minutes. But unlike replication, a CDP system can roll data back to any previous point in time.
CDP products transfer data to the backup server in different ways. Some transfer changed blocks immediately, while others collect changed blocks and send them every few minutes. They also differ in how they do recoveries. Some products are able to restore only the blocks that have changed from a particular point in time, while other programs operate in a more traditional manner by recovering the entire file or file system. Obviously, the first method accommodates more aggressive RTOs and RPOs than the second method. Also, CDP products can meet any type of synchronicity requirement because they can recover one, 10 or 100 systems to any synchronized point in time.
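One way to picture point-in-time recovery is as a journal of timestamped writes replayed over a base image. The Python sketch below is a simplified model under that assumption, not a description of any specific CDP product; the `CdpJournal` class, its method names and the integer timestamps are hypothetical.

```python
class CdpJournal:
    """Hypothetical change journal: a base image plus timestamped writes."""

    def __init__(self, base_image):
        self.base = bytes(base_image)
        self.entries = []  # (timestamp, offset, data), kept in write order

    def record(self, timestamp, offset, data):
        # Assumes writes arrive with non-decreasing timestamps.
        self.entries.append((timestamp, offset, bytes(data)))

    def image_at(self, timestamp):
        """Rebuild the image as it looked at the given point in time."""
        image = bytearray(self.base)
        for ts, offset, data in self.entries:
            if ts > timestamp:
                break  # later writes are ignored for this recovery point
            image[offset:offset + len(data)] = data
        return bytes(image)


journal = CdpJournal(b"hello world")
journal.record(10, 0, b"J")      # image becomes b"Jello world"
journal.record(20, 6, b"W")      # image becomes b"Jello World"
journal.record(30, 0, b"XXXXX")  # an unwanted write, e.g. corruption

before_corruption = journal.image_at(25)
original = journal.image_at(5)
latest = journal.image_at(30)
```

Choosing a timestamp just before the unwanted write recovers the last good state, and applying the same timestamp to the journals of several systems would bring them all back to one synchronized point.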
Another difference in CDP products is that some are database-centric and work only with a particular database, such as Microsoft Exchange or SQL Server. Remember that, unlike traditional backup products, file-based CDP products aren't going to provide interfaces for your database applications. These CDP products copy blocks to the backup destination in the same order they're changed on the client. Restarting your database after such a restore causes it to go into the same recovery mode it would enter if the server had crashed. It examines the data files, figures out what's inconsistent, rolls any necessary transactions or blocks backward or forward, and then the database is up. If the CDP product puts the blocks back in the exact order in which they were changed, then the database should be able to recover from any point in time. Some products can even present a logical unit number (LUN) or volume to your database that it can mount and test before you do the recovery.
Of course, your database vendor may have a different opinion about CDP: If you're not using their supported backup method, they may not be helpful if something goes wrong. Discuss the support issue with your database vendor, and include your DBA in the discussion.
Replicating data every 15 minutes
Alexander Dubose Jones and Townsend LLP, a small appellate law firm with offices in Houston and Austin, Texas, moved from tape to LiveVault Corp.'s InSync (LiveVault service), a continuous data protection product. Vicki McArthur, the firm's administrator, says they had previously relied on daily tape backup as well as on a seven-day offsite tape rotation. The firm experienced all of the challenges traditionally found in tape environments, but recoveries concerned McArthur the most. Nightly backups don't work well with the nature of the legal industry, where files often require last-minute changes. Under a traditional backup system, files created in the morning wouldn't get backed up until that night, and wouldn't be sent offsite until at least the next day. "We faced the possibility of losing an entire day's worth of work or worse," says McArthur.
LiveVault makes a backup of files as soon as they're saved, and then replicates them to a remote site within a few minutes, where all previous versions of any file are accessible at any time. Because only changed bytes are sent, very little bandwidth is required. And since data is replicated every 15 minutes, McArthur believes that "the amount of data loss due to user error is reduced to minutes, possibly less."
Aggressive requirements
You should only consider switching backup products if your current backup product can't meet your requirements (see Pros and cons of alternative backup methods). There are a number of requirements that might prompt you to consider alternatives, such as:
- Remote office data protection
- Backing up large databases
- An application with an RPO of zero
The most common area where backup requirements are difficult to meet is the remote office. Traditional backup schemes can't meet remote office RTO/RPO requirements. There's either too much data or not enough bandwidth to support a reasonable RTO or backup window. Any CDP product can provide backup and recovery of a remote office; most offer two methods. If long RTOs are acceptable, remote sites can back up directly to your central office. In the case of a disaster, just copy the data from the central data center to a disk or tape and send it to the remote site. If this meets RTO requirements, it's the least-expensive option. For tighter RTO requirements, install a backup device at the remote office. The remote office systems can back up to it, and it can then replicate the data to the central site. This provides local recovery and disaster recovery without touching a tape.
CDP products are also superior to traditional backup methods when backing up very large databases. There isn't enough time or horsepower available to transfer several terabytes of data to tape every day. A CDP product could continually back up a database throughout the day, with no noticeable backup window or application impact. Depending on the product, a stringent RTO and short RPO could also be met. Also, some products provide a disk-based copy that can be used in a disaster situation while the real volume is being recovered.
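A rough calculation shows why a daily full backup of a very large database is hard to fit into a day. The Python sketch below uses purely hypothetical figures, a 5 TB database and a sustained 60 MB/s to tape, which are illustrative assumptions rather than numbers from any product or from the sites discussed here.

```python
def backup_window_hours(size_tb, throughput_mb_per_s):
    """Hours needed to move size_tb terabytes at a sustained throughput.

    Uses decimal units: 1 TB = 10**12 bytes, 1 MB = 10**6 bytes.
    """
    seconds = (size_tb * 10**12) / (throughput_mb_per_s * 10**6)
    return seconds / 3600


# Hypothetical example: a 5 TB database and 60 MB/s sustained to tape.
hours = backup_window_hours(5, 60)
print(f"Full backup would take about {hours:.1f} hours")
```

Under those assumed figures a single full backup takes roughly 23 hours, which leaves almost nothing of the day for anything else; sending only changed blocks throughout the day avoids the problem.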
Finally, some database applications require a zero RPO. Most databases can meet such a requirement if they're configured correctly, and if the transaction log is backed up throughout the day. If your database supports that kind of functionality, it's probably best to stick with it. If not, try one of these newer methods.