Are full backups a thing of the past?

In the not-so-distant past, we relied on tape backups for operational recovery, disaster recovery and long-term data retention. But are full nightly backups to tape still needed now that we have new disk-based technologies like snapshots and continuous data protection?

This article can also be found in the Premium Editorial Download: Storage magazine: Using two midrange backup apps at once:

It might be premature to declare nightly full backups dead, but tools like continuous data protection and snapshots can reduce a company's dependence on full backups without compromising data protection.

I recently helped a Fortune 100 client evaluate continuous data protection (CDP) products and was asked if full nightly backups to tape were still needed given the advent of disk-based technologies like snapshots and CDP. Trying not to sound like a consultant attempting to please everyone, I answered "It depends."

Snapshots and CDP provide excellent ways to recover data from a specific point in time when no known data loss or corruption existed. But it can take a while to discover a file or database error, and snapshots aren't typically kept for a long time. That's why full backups to tape are a vital complement to point-in-time backup tools, and will remain so in any comprehensive data protection strategy. The tricky part is to decide how to mix and match backup methods to best meet recovery time objectives (RTOs) and recovery point objectives (RPOs), and stay within your storage budget.

In the not-so-distant past, we relied on tape backups for operational recovery, disaster recovery (DR) and long-term data retention. The latter two are often treated as discrete functions with separate data copies, even though tape may be the target medium for both. Operational recovery usually applies to data lost during normal production operations. In those cases, we would use onsite or offsite tapes to recover the data to the point of the last successful backup (our RPO).

Many companies still rely on offsite tapes to recover data after a disaster. Despite advances in tape technology, the tremendous increase in data is making tape backup more time-consuming and less likely to meet business requirements for timely access to data should some form of data loss occur. These issues are driving many storage teams to look at evolving or newer technologies such as virtual tape libraries (VTLs), snapshots and CDP. Here's where these newer data protection technologies fit best.

Detectable file deletion/corruption. When data is accidentally deleted, there's usually an immediate realization that an error has occurred. Traditional tape backup can be used as the data protection vehicle for accidental deletions. However, the quantity of data and its impact on RPO and RTO requirements must be considered (see "How to determine appropriate RPOs and RTOs," below).

How to determine appropriate RPOs and RTOs
Identifying business requirements and determining the technology best suited to satisfy those requirements is a proverbial "chicken-and-egg" question: Which came first? Most companies don't have the opportunity to start with a clean slate and collect all of their requirements before selecting and delivering technologies to support various service levels.

One approach is to clearly document the data protection services currently delivered in terms of recovery point objective (RPO) and recovery time objective (RTO) combinations, as well as the unit cost (e.g., $/GB) to deliver each of those services.

This information can then be presented to internal customers to determine if IT is over- or underdelivering based on business requirements. Internal customers should have an understanding of their business impact analysis, which will help them compare the value of their data to the cost of data protection for a given RPO/RTO level.

This process will help to align business requirements with existing IT services, identify the need to modify existing services, or help you decide if you need to develop new IT services to meet specific RPO and RTO requirements.
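The comparison described above can be sketched in a few lines of Python. This is only an illustration, not a prescribed method: the `ProtectionService` type, the `cheapest_service` helper, and every RPO, RTO and $/GB figure in the catalog are hypothetical placeholders that you would replace with your own documented service levels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtectionService:
    name: str
    rpo_hours: float    # worst-case window of lost data
    rto_hours: float    # worst-case time to restore service
    cost_per_gb: float  # unit cost in $/GB (placeholder value)


def cheapest_service(services, required_rpo_hours, required_rto_hours):
    """Return the lowest-cost service that meets both objectives, or None."""
    candidates = [
        s for s in services
        if s.rpo_hours <= required_rpo_hours and s.rto_hours <= required_rto_hours
    ]
    return min(candidates, key=lambda s: s.cost_per_gb, default=None)


# Hypothetical catalog of currently delivered services (all numbers assumed).
CATALOG = [
    ProtectionService("nightly tape", rpo_hours=24, rto_hours=48, cost_per_gb=0.10),
    ProtectionService("hourly snapshots", rpo_hours=1, rto_hours=2, cost_per_gb=0.50),
    ProtectionService("CDP", rpo_hours=0.01, rto_hours=1, cost_per_gb=1.50),
]

# An internal customer who can tolerate four hours of loss and four hours of downtime.
match = cheapest_service(CATALOG, required_rpo_hours=4, required_rto_hours=4)
```

If `cheapest_service` returns `None`, no existing service meets the requirement, which is the signal to modify a service or develop a new one.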

VTLs can improve recovery, but they're not as simple to deploy as snapshots or CDP. And there's still a physical tape-like recovery process to follow: if the data has already been moved to tape, the VTL must first recover it from tape. Snapshots, however, can capture multiple point-in-time data images throughout the business day, and a snapshot volume can be mounted and used directly, which lets you meet shorter RPOs.

CDP offers benefits similar to those of snapshots, with the added capability of very granular RPOs.

Undetected file deletion/corruption. Sometimes a corrupt or deleted file isn't discovered for days, weeks or months. This is a sweet spot for traditional tape backups and virtual tape. Although the RPO will be missed because the problem went undiscovered for some time, the noncorrupted data from before the error is safe and can be recovered.
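To see why longer-retention backups matter here, consider a rough Python sketch that asks which protection layers still hold a copy from before the error. The function name and the retention periods are assumptions made up for illustration, and the model is deliberately simple: a layer qualifies only if it keeps recovery points for longer than the error went unnoticed.

```python
def layers_with_clean_copy(retention_days, detection_delay_days):
    """Return the layers whose oldest recovery point predates the error.

    retention_days: mapping of layer name -> days of recovery points kept.
    detection_delay_days: days between the error and its discovery.
    """
    return [
        name for name, days in retention_days.items()
        if days > detection_delay_days
    ]


# Hypothetical retention policy (placeholder values).
RETENTION = {"CDP journal": 3, "snapshots": 7, "tape": 365}

# Corruption noticed 30 days after it happened.
usable = layers_with_clean_copy(RETENTION, detection_delay_days=30)
```

With these assumed values only tape reaches back far enough, whereas an error caught within a day could be recovered from any of the three layers.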

Storage device failure. Failure of an unprotected drive or a RAID double fault can mean the loss of significant amounts of data. This risk is usually understood and addressed by various types of redundancy.

If additional redundancy isn't implemented, or isn't implemented to the appropriate degree, the recovery options are similar to those for detectable file deletion or corruption. Given the amount of data that can be affected by this type of failure, RPO and RTO are important considerations. Disk-based backup technologies may be preferred to meet these objectives.

Interdependency failure. This can be thought of as "effective" data loss due to lack of synchronization or data inconsistency across multiple application components. If one level of service is provided to a portion of an interdependent environment and a different level is provided to another portion, the overall protection is only as good as the lesser of the two levels of service.
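The "lesser of the two levels" rule can be expressed as a short calculation. The sketch below is a simplification under stated assumptions: each component is described only by an RPO and an RTO in hours, components are recovered in parallel, and the component names and values are hypothetical.

```python
def effective_objectives(components):
    """Return the (RPO, RTO) an interdependent application can actually rely on.

    components: mapping of component name -> (rpo_hours, rto_hours).
    The weakest component determines the result, so take the maximum of each.
    """
    worst_rpo = max(rpo for rpo, _ in components.values())
    worst_rto = max(rto for _, rto in components.values())
    return worst_rpo, worst_rto


# Hypothetical application: a database on CDP and a file store on nightly backup.
APP = {
    "database": (0.25, 1),
    "file store": (24, 8),
}

rpo, rto = effective_objectives(APP)
```

Here the database's 15-minute RPO buys little, because the application as a whole can only be brought back to a consistent state as of the file store's last backup.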

While tape can be used to recover multiple interdependent data sets, the result will likely require cleanup because traditional tape backups have no way to guarantee consistency across apps. Disk-based technologies such as snapshots and CDP often provide some support for consistency groups. It's important to look at the various offerings to determine if they support the type of consistency needed. This depends on whether the disk-based data protection product is host-, appliance- or array-based. Depending on where the product sits in the SAN, it will have different knowledge of data consistency across the storage gear it supports.

Site failure. The loss of a site falls under the realm of DR. The scale and organizational impact of this scenario differentiates it from more localized operational data-loss scenarios.

Traditional tape backups are still used by many organizations to recover from this type of failure. In all cases, this involves shipping a significant number of tapes from one site to another. Restores must then be prioritized before the long process of recovering from tape begins. Various VTL products offer a replication capability that can speed this process. However, with VTLs there's still a physical tape-like recovery process if the data has been moved from the VTL to tape.

Disk-based technologies can play a key role in rapid recovery from a site failure. Improvements in network bandwidth and compression technology have allowed many companies to deploy asynchronous or synchronous replication applications to copy live data or snapshots from one location to another.

Some CDP products can also replicate between sites. One of the caveats of traditional replication is that both good and bad data are replicated. By applying a CDP approach to replication, replicated data at the alternate data center could also include the dimension of time so replicas can be rolled back to a granular point in time prior to corruption.
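The "dimension of time" can be illustrated with a toy write journal. This is a conceptual sketch only and doesn't represent any vendor's implementation; the `WriteJournal` class, its method names, and the integer timestamps and block contents are all invented for the example.

```python
class WriteJournal:
    """Toy CDP-style journal: keeps a baseline plus every timestamped write."""

    def __init__(self, baseline):
        self._baseline = dict(baseline)  # copy made by the initial synchronization
        self._entries = []               # (timestamp, block, data) in time order

    def record(self, timestamp, block, data):
        """Append a write; writes must arrive in non-decreasing time order."""
        if self._entries and timestamp < self._entries[-1][0]:
            raise ValueError("writes must be recorded in time order")
        self._entries.append((timestamp, block, data))

    def image_at(self, timestamp):
        """Rebuild the volume image as of `timestamp` (inclusive)."""
        image = dict(self._baseline)
        for ts, block, data in self._entries:
            if ts > timestamp:
                break
            image[block] = data
        return image


journal = WriteJournal({0: "A0", 1: "B0"})
journal.record(10, 0, "A1")
journal.record(20, 1, "CORRUPT")  # bad data is journaled like any other write
journal.record(30, 0, "A2")

# Roll the replica back to just before the corrupting write at time 20.
clean = journal.image_at(19)
```

A plain replica would hold only the latest image, corruption included. Because the journal keeps every write with its timestamp, the replica can be rebuilt as of any earlier moment.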


It's not just about failures
The following broader topics should be considered when evaluating disk-based data protection solutions.

Integration. Because no single tool can perform and manage traditional backups, snapshots, replication and CDP, it's likely you'll deploy multiple data protection products. It's important to understand how--or if--these tools can integrate with each other. As part of a cohesive data protection strategy, you need to know the flow of data through the various layers of data protection (see "Balancing cost and risk," below).

Initial synchronization. Think of this as the equivalent of a traditional full tape backup. Every disk-based solution needs a starting point, and initial synchronization is usually it. CDP must create an initial copy of your data somewhere and replication also creates an initial copy. This initial synchronization process should be understood from a "how" and a "how long" standpoint.

Application support. You also need to know the backup requirements of each specific app. For example, making a snapshot of an Oracle database, Microsoft SQL Server instance or Exchange environment may require the data protection product to perform certain pre-tasks or interface with specific APIs. One database product could mount a CDP volume from any point in time and automatically recover to the last consistent database state, while another's best practice recommends that you quiesce the database periodically to ensure consistent recovery points.

Storage requirements. Additional storage is required for each disk-based data protection solution. An understanding of business requirements and estimates of metrics such as data change rate will help you estimate additional capacity needs.
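As a rough first-order estimate, additional capacity can be approximated from the protected data size, the daily change rate and the retention period. The formula below is an assumption for illustration rather than a vendor sizing rule: it ignores compression, deduplication and repeated changes to the same blocks, and the `full_copy` flag simply models whether the technology keeps a complete initial copy.

```python
def estimate_extra_capacity_gb(protected_gb, daily_change_rate, retention_days,
                               full_copy=True):
    """Estimate extra disk needed for a disk-based protection layer.

    protected_gb: size of the data being protected, in GB.
    daily_change_rate: fraction of the data that changes per day (0.0 to 1.0).
    retention_days: how many days of changes are kept.
    full_copy: True if an initial full copy is also stored.
    """
    if not 0 <= daily_change_rate <= 1:
        raise ValueError("daily_change_rate must be between 0 and 1")
    if protected_gb < 0 or retention_days < 0:
        raise ValueError("sizes and retention must be non-negative")
    changed_gb = protected_gb * daily_change_rate * retention_days
    baseline_gb = protected_gb if full_copy else 0
    return baseline_gb + changed_gb


# Hypothetical workload: 1 TB protected, 5% daily change, 7 days retained.
with_copy = estimate_extra_capacity_gb(1000, 0.05, 7)
changes_only = estimate_extra_capacity_gb(1000, 0.05, 7, full_copy=False)
```

Under these assumed inputs the changed data alone comes to about 350 GB, and an initial full copy adds another 1,000 GB on top of that.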

Monitoring. You should periodically test your recovery process, but the monitoring and reporting capabilities of the data protection solutions should also help you determine if your apps are adequately protected. These capabilities may include predicting potential problems and sending notifications when exceptions occur. You should also consider whether monitoring services can be integrated into your service delivery and service performance dashboards.

Storage vendors are responding to modern data protection needs by integrating disk-based backup technologies into their central backup suites. Until this integration is complete, you'll need to implement and manage a multilayer data protection strategy based on the criticality of the app owning the data.


Balancing cost and risk
The following steps will help you identify which applications require different levels of protection, thereby balancing cost and risk.

  1. Know the business requirements for applications with regard to recovery point objective (RPO), recovery time objective (RTO) and retention requirements.


  2. Identify the probability of various risks occurring for each of the above. For example, the probability of discovering latent data corruption two weeks after the fact may be very small, so you may choose to ignore it.


  3. Consult with each business unit about its RPO/RTO requirements and related costs. For example, continuous data protection may satisfy the desired RPO/RTO requirements, but at a cost much higher than the value of the data being protected.
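One simple way to put numbers on the steps above is to compare the yearly cost of a protection level with the loss it's expected to prevent. This sketch assumes that risk can be reduced to an annual probability multiplied by an impact cost, which is a simplification. The function names and all figures are hypothetical.

```python
def expected_annual_loss(annual_probability, impact_cost):
    """Expected yearly loss from a risk: probability times impact."""
    if not 0 <= annual_probability <= 1:
        raise ValueError("annual_probability must be between 0 and 1")
    return annual_probability * impact_cost


def protection_is_justified(annual_probability, impact_cost, annual_protection_cost):
    """True if protection costs less per year than the loss it is meant to avoid."""
    return annual_protection_cost < expected_annual_loss(annual_probability, impact_cost)


# Hypothetical cases (placeholder figures):
# a 2% yearly chance of a $1,000,000 loss vs. $5,000/year of protection.
critical_app = protection_is_justified(0.02, 1_000_000, 5_000)
# a 0.1% yearly chance of a $100,000 loss vs. the same $5,000/year.
minor_app = protection_is_justified(0.001, 100_000, 5_000)
```

In the second case the protection costs far more than the expected loss, which mirrors the example in step 3 of a technology that meets the RPO/RTO but costs more than the data is worth.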

This was first published in September 2008
