What are your backup, DR and data retention policies?

Can you describe your backup, retention and disaster recovery policies? And do they correctly address your business needs?

When I ask people about their backup, data retention and disaster recovery policies, most fall into one of three groups.

The first group responds by telling me about how often they do fulls and incrementals, how long they keep the tapes and how often they send them off site. The second group is only a little more sophisticated: They have one answer for databases and another for files.

Only the third group is sophisticated enough to ask me what I mean: Am I referring to backup or archive? Which service level am I referring to? And so on.

The first two groups are guilty of focusing on the mechanics of their backup environment, dealing primarily with operational concerns such as tape management. They share two common attributes:

  • All data is essentially treated the same. There's little consideration given to the value of data to the organization or to customer requirements for service levels.
  • Only a minimal number of policy attributes have been taken into account. Other backup policy dimensions have largely been ignored.

In contrast, the third group is more likely to be addressing business needs. They consider a variety of policy attributes. They have classified data and developed policies that are appropriate to each level of classification. They have formal service level agreements (SLAs) between the IT organization and its users. This is the direction toward which many organizations would like to head.

If the first two scenarios remind you of your organization, here are a few things that you can do to begin to evolve to a model more like that of the third group.

How capable are your backup policies?

  • Performance capacity: Measured as deliverable bandwidth, in terms of both single-stream and aggregate capacity. Includes individual tape drive performance, aggregate tape drive performance, disk, hosts, networks, etc.
  • Data volume capacity: Total library capacity
  • Media options: Options available within the environment, including disk, optical and tape
  • Replication options: Intended to assist in reducing RTO/RPO times
  • Off-site options: Intended to address DR requirements
  • Tape management capabilities: Considerations include tape duplication capabilities, options for frequency of backups, etc.
  • Operational capabilities: Staffing considerations, notification and reporting tools available, etc.
  • Costs: Budgetary constraints

What really matters?

An optimized backup environment delivers services that are aligned with the business requirements of the organization. This requires having both the capabilities and resources available to meet these service demands, as well as the policies and processes in place to apply these resources where appropriate.

In backup environments, an all-too-common approach is to define a set of policies based primarily on two attributes: retention and the frequency of full backups. Often, decisions about these aspects are based not on an understanding of business need, but rather on IT operational standards or a vague sense of what is generally believed to be acceptable.

That approach may not adequately meet the data protection needs of the organization. It would be far more useful to develop an approach to backup policy that ensures that resources and capabilities match business needs. Here's how to begin.

First, develop service objectives, a set of attributes around which a service is defined. An SLA is a written agreement between IT and a user for providing a set of services that have been defined by one or more service objectives.

Clearly, backup frequency and retention are important backup service objectives, but there are others (see "What's your objective for backup?") that merit some additional comments. Although "backup" is commonly used to refer to the entire backup/restore operation, users don't care about backup at all. Their real concern is restorability. Two attributes go to the heart of that concern: recovery time objective (RTO) and recovery point objective (RPO). Respectively, they answer the questions: "How fast can data be restored?" and "How current must that data be?"

Focusing on frequency and retention doesn't address those concerns. To establish RTO and RPO, gather and define requirements for your key applications. How well is the value of data understood within the organization? Many organizations are attempting to establish tiers of primary storage in which the most important data is placed on Tier One storage, and less important data is placed on more economical Tier Two storage devices. Defining RTO and RPO requirements for data can be thought of as one of the more important elements of this data classification process.

Another increasingly important service objective relates to historical data retention, also known as archiving. Archiving is different from the retention and expiration parameters of backup data. It's a point-in-time snapshot of selected data that is intended to be stored and retrievable for a significant period of time. Archiving is more about content than format, so it's often a completely separate process from backup, one in which selected content is dumped into a standard format for safekeeping. The renewed concern about regulatory compliance is driving a re-examination and modification of archive policies and procedures.

What's your objective for backup?
  • Backup frequency: How often a file is backed up. Also related to the cycle of full/incremental backups
  • Retention period: The length of time that a particular version of a file is available to be restored
  • Backup window: The period of time available each day to complete backups
  • Recovery time objective (RTO): The acceptable amount of time from the start of the data recovery process until its completion
  • Recovery point objective (RPO): The acceptable time variance between the current time and the age of the data available to be restored
  • Off-site/disaster recovery requirements: Additional recovery requirements and considerations for disaster recovery scenarios, such as data dependencies and additional coordination requirements
  • Archival requirements: Long-term retention of historical data and related management requirements. Driven by business and regulatory demands
  • Special media considerations: Additional requirements regarding media type, validation, refresh, etc.
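Backup frequency and RPO are related: data written just after one backup finishes is unprotected until the next one runs, so the interval between backups is roughly the worst-case age of restorable data. The Python sketch below illustrates that check. It is a simplification that ignores how long the backup itself takes, and the function names and sample figures are hypothetical.

```python
def worst_case_data_age_hours(backup_interval_hours: float) -> float:
    """Approximate worst-case age of restorable data.

    Simplifying assumption: backup duration is ignored, so the worst case
    equals the interval between successive backups.
    """
    if backup_interval_hours <= 0:
        raise ValueError("backup interval must be positive")
    return backup_interval_hours


def meets_rpo(backup_interval_hours: float, rpo_hours: float) -> bool:
    """True if the backup schedule can satisfy the stated RPO."""
    return worst_case_data_age_hours(backup_interval_hours) <= rpo_hours


# Hypothetical example: a nightly backup against a four-hour RPO.
print(meets_rpo(backup_interval_hours=24, rpo_hours=4))  # False
print(meets_rpo(backup_interval_hours=1, rpo_hours=4))   # True
```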

Know thyself

In addition to understanding and establishing service objectives, you must understand internal service capabilities and limitations. The ability to provide effective backup services is dependent on a number of elements. (See "How capable are your backup policies?")

The goal of analyzing your backup capabilities is to establish a set of metrics that help to define the level of services that can be delivered. Typical questions you must answer are:

  • How much data can be backed up per hour?
  • What's the total online capacity?
  • How long does it take to duplicate tapes for off-site storage?
  • What's the availability of the appropriate levels of expertise to provide these services?
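The first two questions lend themselves to simple arithmetic once the numbers have been measured. The following Python sketch shows one way to record them and derive what can be promised. The class name, field names and sample figures are assumptions chosen for illustration, not measurements from any real environment.

```python
from dataclasses import dataclass


@dataclass
class BackupCapability:
    """Measured capabilities of a backup environment (illustrative fields)."""
    throughput_gb_per_hour: float  # aggregate data backed up per hour
    window_hours: float            # backup window available each day
    library_capacity_gb: float     # total online capacity

    def max_gb_per_window(self) -> float:
        """Most data that can be backed up within one backup window."""
        return self.throughput_gb_per_hour * self.window_hours

    def fits_window(self, nightly_gb: float) -> bool:
        """True if the nightly volume can complete inside the window."""
        return nightly_gb <= self.max_gb_per_window()

    def retained_fulls(self, full_backup_gb: float) -> int:
        """How many full backups of a given size fit in online capacity."""
        return int(self.library_capacity_gb // full_backup_gb)


# Hypothetical figures, for illustration only.
cap = BackupCapability(throughput_gb_per_hour=200,
                       window_hours=8,
                       library_capacity_gb=20_000)
print(cap.max_gb_per_window())   # 1600
print(cap.fits_window(2_000))    # False
print(cap.retained_fulls(3_000)) # 6
```

Numbers like these are what turn a conversation about service levels from opinion into negotiation.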

In many organizations, replication capabilities are viewed as independent of backup. While replication is a different set of operational processes from backup, both are really points on the same continuum of RTO and RPO capabilities when viewed from the restorability point of view.

Can you deliver?

The next step in optimizing the backup environment is rationalizing service objectives and capabilities and developing formal service offerings. It's critical to promise only what can be delivered. If the RTO of a key application with 500GB is one hour, and you can only recover at the rate of 18GB per hour, you have a serious problem (but I'm sure you already knew that).
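The arithmetic behind that example is worth spelling out. This short Python sketch uses the 500GB and 18GB-per-hour figures from the paragraph above; the function names are hypothetical, and the calculation assumes a constant sustained restore rate.

```python
def restore_hours(data_gb: float, restore_rate_gb_per_hour: float) -> float:
    """Estimated hours to restore data_gb at a constant restore rate."""
    if restore_rate_gb_per_hour <= 0:
        raise ValueError("restore rate must be positive")
    return data_gb / restore_rate_gb_per_hour


def meets_rto(data_gb: float, restore_rate_gb_per_hour: float,
              rto_hours: float) -> bool:
    """True if the estimated restore time is within the RTO."""
    return restore_hours(data_gb, restore_rate_gb_per_hour) <= rto_hours


def required_rate_gb_per_hour(data_gb: float, rto_hours: float) -> float:
    """Restore rate needed to recover data_gb within the RTO."""
    if rto_hours <= 0:
        raise ValueError("RTO must be positive")
    return data_gb / rto_hours


print(f"{restore_hours(500, 18):.1f} hours")     # 27.8 hours
print(meets_rto(500, 18, rto_hours=1))           # False
print(required_rate_gb_per_hour(500, 1))         # 500.0
```

At 18GB per hour, the restore takes almost 28 hours against a one-hour objective. Closing that gap means either a restore rate of about 500GB per hour or a renegotiated RTO.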

You have to balance what's feasible, what's desired and what it costs. This is where having a clearly understood set of metrics becomes essential. If you can't meet requirements for data recovery with current resources, then either the requirements must be modified or the resources expanded. Having the data available to present to senior management and to users is the key to negotiating realistic service level agreements.

Having the right backup policies is only one part of the much larger issue of storage management. Everything that has been discussed about service objectives and service capabilities can be applied to other areas of storage. Developing backup policies is part of a process of shifting focus from the inward perspective of storage management to an outward view focused on the needs of the business. It's possible to be good operationally (regular backups completed on time, etc.) but bad from a business point of view (unable to restore key data in the time required). Reviewing policies is a good place to begin coming to terms with that problem.
