Disaster recovery planning: Special report
31 May 2006 | SearchStorage.com
When it's time to implement a disaster recovery plan, storage administrators are faced with the task of protecting important corporate data off site and making adequate preparations to resume normal business operations in the event of an emergency. This may seem like a reasonably straightforward objective, but choosing the right infrastructure can prove unwieldy. A wide proliferation of storage technologies makes the choice of platform difficult, and software choices must often accommodate a heterogeneous storage environment that is frequently separated by a large geographic distance. This article outlines many of the tools and technologies that are currently available for disaster recovery (DR) applications and examines some future directions of DR.
Setting recovery objectives
Before any disaster plan can be implemented, it is absolutely critical to establish the recovery requirements. This involves a thorough understanding of your applications and the way they relate to everyday business.
Data classification plays a part in disaster planning. Not all data is created equal -- the company's Oracle database or Exchange platform may be mission-critical, but a repository of PowerPoint presentations may not be essential to the business in the event of disaster. The first step is to identify the applications and data sets that should be protected by a DR plan. Classifying data for DR carries a twist. "Storage classification is more around usage, access and so forth, while DR is more around linkage to business functions, risk mitigation and survivability versus how the storage is being used," says Greg Schulz, founder and senior analyst at StorageIO Group.
Once you understand the applications that need DR protection, you'll need to set a recovery point objective (RPO) and recovery time objective (RTO) for each one. The RPO essentially defines how much data you can afford to lose, while the RTO defines just how much time you have to recover from the disaster. RPO, and to a lesser extent RTO, will influence the technologies and costs involved in your DR plan. "For example, think about American Express having to tell Macy's it just lost the last 1,000 transactions that came out of its stores," says Pierre Dorion, senior business continuity consultant at Mainland Information Systems Inc. "It wouldn't send Amex [American Express] filing for Chapter 11, but it sure would leave them with a serious black eye!"
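To make the relationship between objectives and technology choices concrete, here is a minimal Python sketch that screens protection methods against an application's RPO and RTO. The class names, the application names and every number in it are illustrative assumptions, not figures from any vendor or from the analysts quoted above.

```python
from dataclasses import dataclass


@dataclass
class Objective:
    """Recovery targets for one application, in minutes (hypothetical values)."""
    name: str
    rpo_minutes: float  # maximum tolerable data loss
    rto_minutes: float  # maximum tolerable downtime


@dataclass
class Protection:
    """A protection method with assumed, illustrative characteristics."""
    name: str
    copy_interval_minutes: float  # time between protected copies
    restore_minutes: float        # assumed time to bring the data back


def meets(obj: Objective, method: Protection) -> bool:
    # Worst-case data loss is the gap between copies; worst-case downtime
    # is approximated here by the restore time alone.
    return (method.copy_interval_minutes <= obj.rpo_minutes
            and method.restore_minutes <= obj.rto_minutes)


def candidates(obj: Objective, methods: list) -> list:
    """Return the names of the methods that satisfy both objectives."""
    return [m.name for m in methods if meets(obj, m)]


# Illustrative numbers only.
METHODS = [
    Protection("nightly tape backup", 24 * 60, 8 * 60),
    Protection("hourly snapshot plus replication", 60, 60),
    Protection("continuous data protection", 0, 15),
]

database = Objective("order database", rpo_minutes=5, rto_minutes=30)
slides = Objective("presentation archive", rpo_minutes=48 * 60, rto_minutes=72 * 60)

print(candidates(database, METHODS))
print(candidates(slides, METHODS))
```

Under these assumed numbers, only the continuous method satisfies the tight database objectives, while all three methods are acceptable for the presentation archive. Tighter objectives narrow the field and tend to raise the cost.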
Making DR work
No two DR plans are identical. Each strategy must be formulated to match unique business needs and legal requirements while accounting for the most likely "disaster" scenarios. While needs and plans can vary dramatically, most draw from the same selection of underlying technologies. A DR plan will often include a mix of several different technologies.
Backup. Traditional backups are still the foundation of many DR plans, leveraging existing tape drives or libraries to copy crucial files to tape media that can then be taken off site to a protected location. Disk-based backups are systematically replacing tape, bringing better speed and reliability to the backup process. Backups can be made to any disk array or other platform, but virtual tape has emerged as a popular alternative when disk technology must emulate tape. Backups can be made locally on site but can also be made to off-site locations across a WAN link.
Mirroring/replication. This disk-to-disk process simply creates a data copy between two disk platforms. When trouble strikes the original copy, data can be restored from the replicated copy -- or possibly even accessed directly from the mirror. As with backups, mirroring between disks can be handled locally within the data center or between disks across a WAN link. Most storage array products provide mirroring/replication software utilities bundled with the product.
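Replication can run synchronously or asynchronously, a distinction this article returns to when discussing hardware. The toy Python sketch below illustrates the difference; the Mirror class and its methods are hypothetical names for illustration and do not model any particular array's replication software.

```python
from collections import deque


class Mirror:
    """Toy primary/replica pair (illustrative only)."""

    def __init__(self, synchronous: bool):
        self.synchronous = synchronous
        self.primary = {}       # block number -> data at the primary site
        self.replica = {}       # block number -> data at the remote site
        self.pending = deque()  # writes accepted but not yet on the replica

    def write(self, block, data):
        self.primary[block] = data
        if self.synchronous:
            # The write completes only once both copies are updated.
            self.replica[block] = data
        else:
            # The write completes immediately; the replica catches up later.
            self.pending.append((block, data))

    def drain(self):
        """Apply queued writes to the replica in their original order."""
        while self.pending:
            block, data = self.pending.popleft()
            self.replica[block] = data

    def lag(self) -> int:
        """Number of writes that would be lost if the primary failed now."""
        return len(self.pending)
```

In this model the synchronous mirror never lags, so nothing is lost if the primary fails, but every write waits on the remote copy. The asynchronous mirror acknowledges writes at once and accepts that whatever is still queued is exposed to loss.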
Snapshot. A snapshot creates an index of all data locations on the storage platform and tracks any subsequent changes to the data at each location. Over time, a running log of storage changes develops, allowing administrators to restore files and folders that may become lost or corrupted. Because the storage space available for these index files is limited, only a finite number of snapshots can be kept. Consequently, snapshots are typically employed as a short-term protective tactic. A snapshot is essentially a list of data locations and does not copy data to a second location, so it is still important to protect data through periodic backups or replication.
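One common way to implement this kind of change tracking is copy-on-write, sketched below in Python. This is an assumption chosen for illustration: the article does not say which mechanism a given product uses, and the Volume class and its methods are hypothetical.

```python
class Volume:
    """Toy block volume with copy-on-write snapshots (illustrative only)."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)  # block number -> current data
        self.snapshots = []         # per snapshot: block number -> preserved old data

    def snapshot(self) -> int:
        # Taking a snapshot copies no data; it only opens an empty change record.
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, block, data):
        # Before overwriting, preserve the old contents in every snapshot
        # that has not already saved this block (None means "did not exist").
        old = self.blocks.get(block)
        for preserved in self.snapshots:
            if block not in preserved:
                preserved[block] = old
        self.blocks[block] = data

    def read_snapshot(self, snapshot_id: int, block):
        # Changed blocks come from the preserved copy; unchanged blocks
        # are read straight from the live volume.
        preserved = self.snapshots[snapshot_id]
        if block in preserved:
            return preserved[block]
        return self.blocks.get(block)


volume = Volume({0: "ledger-v1", 1: "report-v1"})
snap = volume.snapshot()
volume.write(1, "report-v2")
print(volume.read_snapshot(snap, 1))  # the pre-change contents of block 1
```

Note that reading an unchanged block falls through to the live volume. That is why a snapshot of this kind cannot stand in for a backup: if the volume itself is lost, most of the snapshot goes with it.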
WAN/WAFS. Disaster planning usually demands that data copies be stored in a location separate from the original -- if fire destroys the data center, a viable copy of the data is still available. Today, WAN links are used to back up or restore data directly between remote sites. For example, remote offices might back up their servers to the primary data center each night using fractional T3 WAN links, mobile users can back up their laptops to the data center across a broadband Internet connection, or a data center might replicate its main servers to a backup data center across an OC3 connection.
WAFS is an emerging technology for IT centralization, allowing remote offices "local-like" access to applications and data from the primary data center across a WAN link. WAFS is not directly considered a DR technology, but has an impact on remote office IT consolidation, which will influence DR strategies.
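Whether a WAN link can carry a nightly backup is largely a matter of arithmetic. The Python sketch below estimates transfer time from data volume and link speed, using the standard line rates for a full T3 (44.736 Mbit/s) and an OC-3 (155.52 Mbit/s). The 80% efficiency factor and the 100 GB workload are assumptions for illustration, and a fractional T3 would run at a correspondingly lower rate.

```python
T3_MBPS = 44.736    # full T3 line rate, megabits per second
OC3_MBPS = 155.52   # OC-3 line rate, megabits per second


def transfer_hours(data_gb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate hours needed to move data_gb (decimal gigabytes) over a link.

    efficiency is an assumed fraction of the line rate left for payload
    after protocol overhead and competing traffic.
    """
    megabits = data_gb * 8 * 1000            # 1 GB = 8,000 megabits
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


for label, rate in (("T3", T3_MBPS), ("OC-3", OC3_MBPS)):
    print(f"100 GB over {label}: {transfer_hours(100, rate):.1f} hours")
```

Under these assumptions, 100 GB takes a little over six hours on a full T3 and under two hours on an OC-3. This is why compression and data deduplication matter when the backup window is short.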
Virtualization. The demand for duplicate equipment can impose a significant cost for DR users. For example, an EMC Corp. Centera in the primary data center may need another EMC Centera at the remote location. Virtualization technology abstracts the hardware, usually allowing data to reside and operate on different hardware at the remote location. This gives organizations the option to deploy used or available hardware, rather than purchasing identical storage hardware for the remote site.
CDP. Continuous data protection maintains a running log of storage changes based on key events or even down to individual I/O operations. This type of continuous operation reduces the RPO to almost zero, and administrators can recover lost or corrupted data from disk in an extremely short timeframe -- keeping RTO short. CDP is not generally intended as a DR technology but is increasingly employed to protect busy networks (e.g. retail or data processing operations) against lost transactions. As with snapshots, data on a CDP platform still needs to be backed up.
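The idea of a running log that supports recovery to an arbitrary point can be sketched in a few lines of Python. The CdpJournal class below is a hypothetical toy that records every write with a timestamp. It assumes writes arrive in time order and keeps everything in memory, which a real CDP product would not do.

```python
class CdpJournal:
    """Toy continuous-data-protection log (illustrative only)."""

    def __init__(self):
        self.entries = []  # (timestamp, block, data), in arrival order

    def record(self, timestamp, block, data):
        # Every write is appended; nothing is ever overwritten in the log.
        self.entries.append((timestamp, block, data))

    def restore(self, point_in_time):
        """Rebuild block contents as they stood at point_in_time."""
        image = {}
        for timestamp, block, data in self.entries:
            if timestamp > point_in_time:
                break  # entries are time-ordered, so the rest are later writes
            image[block] = data
        return image


journal = CdpJournal()
journal.record(1, 0, "order 1001")
journal.record(2, 1, "order 1002")
journal.record(3, 0, "corrupted")

print(journal.restore(2))  # state just before the bad write at time 3
```

Because every write is kept, the administrator can pick a recovery point immediately before the corruption rather than falling back to the last scheduled copy. This is what drives the RPO toward zero.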
CAS. Data that requires only occasional access is sometimes archived to a disk-based content-addressed storage (CAS) device. Unlike ordinary disk platforms, CAS devices typically include security features to prevent file alteration/tampering, compliance features to ensure file availability for a prescribed retention period and data deduplication (a.k.a. "intelligent compression") features that optimize disk space utilization by eliminating redundant data. CAS devices are mainly intended for long-term archival storage but are sometimes used for disk-based backup purposes.
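Content addressing itself is simple to illustrate: the address of an object is derived from a hash of its contents, so identical data is stored once and any alteration is detectable. The Python sketch below uses SHA-256 from the standard library as an assumed hash. The ContentStore class is hypothetical, and commercial CAS devices use their own addressing schemes and add retention controls not shown here.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store (illustrative only)."""

    def __init__(self):
        self._objects = {}  # content address (hex digest) -> stored bytes

    def put(self, data: bytes) -> str:
        # The address is computed from the content, so the same bytes
        # always map to the same address and are stored only once.
        address = hashlib.sha256(data).hexdigest()
        self._objects.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        data = self._objects[address]
        # Re-hashing on read exposes any tampering with the stored bytes.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored content does not match its address")
        return data

    def __len__(self) -> int:
        return len(self._objects)


store = ContentStore()
first = store.put(b"Q1 board presentation")
second = store.put(b"Q1 board presentation")  # duplicate content
print(first == second, len(store))
```

Storing the same bytes twice yields the same address and a single stored object, which is the space saving the article describes as data deduplication.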
Turning to tools
Any DR strategy has to start with a solid plan, but it's the hardware and software that turn plans into reality. Hardware goes beyond the disks, supporting advanced storage features like mirroring/replication, snapshots or CDP. Hardware also plays a vital role in moving data across a WAN, allowing synchronous or asynchronous data movement and often providing compression and data deduplication capabilities that optimize bandwidth use.
Analysts point to the DMX from EMC as a leading enterprise storage product for DR, citing robust storage and mirroring/replication that handles multisite configurations. The DMX also supports synchronous and asynchronous data movement and multivolume data handling to maintain data integrity between multiple sites. Other leading storage hardware products for DR include the NearStor virtual tape library (VTL) from Network Appliance Inc., and the TagmaStore storage array from Hitachi Data Systems Inc.
Cisco Systems Inc. is another notable DR hardware vendor, providing a variety of network routers and switches that support local, metropolitan and wide-area data movement using technologies like SONET/SDH, DWDM and IP -- important technologies for synchronous and asynchronous data movement. Brocade Communications Systems Inc. also specializes in switching hardware, such as the SilkWorm family of director switches.
Tape still plays a role in DR for many organizations. Products like the L1400M tape library from Sun Microsystems Inc. (StorageTek) and the PX720 tape library from Quantum Corp. handle backup and archiving within the data center and between sites.
Heterogeneous support is increasingly important when selecting storage equipment, ensuring that a new manufacturer's product will interoperate properly with existing products in the data center. In many cases, heterogeneous products can be configured and maintained using existing management software, easing integration problems for new systems. Products are often evaluated in advance to guarantee interoperability.
Backup software and DR
Software tools also deserve serious consideration in any DR plan, bringing organization and management features to the users who must implement and oversee DR operations. Although there are many types of backup software available for the enterprise, the tools used for DR must be particularly easy to use, especially in terms of rapid recovery. DR software must also be flexible, adaptable, interoperable and scalable. In short, it must be able to grow and change as the DR environment evolves.
NetBackup from Symantec Corp. (Veritas) is perhaps the most recognized backup tool for DR applications. Capable of working with disk and tape platforms, NetBackup can operate enterprise-wide backup and recovery tasks regardless of geographic location using a single console under UNIX, Windows, Linux and NetWare. NetWorker from EMC Legato, Tivoli Storage Manager from IBM and Galaxy Backup & Recovery from CommVault Systems Inc. are three other notable backup/recovery products selected for DR tasks.
DR planning software
However, software considerations go beyond backup. Software tools are available to help organizations develop and maintain their DR plan. In most cases, these tools are intended to track hardware in the production and DR environments, ensuring that storage arrays, SAN switches, tape drives, servers and other devices are all available and functioning as expected. TeraCloud Corp.'s Storage Framework and Onaro Inc.'s SANScreen are two notable products that gather and report this type of information. Sentric's Destiny is an emerging product that classifies and reports data on email servers, file servers and database servers.
Aside from infrastructure tracking tools, there are several tools available that can actually help organizations develop and manage their specific plans. For example, RecoveryPlanner from RecoveryPlanner.com provides DR planners with Web-based software for plan creation, distribution and management over the Internet. It also includes notification services, so responsible individuals can be kept informed throughout the disaster and recovery process. Similarly, Strohl Systems Group Inc. offers its Living Disaster Recovery Planning System software to help users develop comprehensive business continuity plans.
The future of DR
Disaster recovery strategies are shifting away from "backups" and moving toward the concepts of "recovery" and "continuance." The trend toward continuance should continue as organizations set their sights on uninterrupted business operations. Part of this trend is due to the continued evolution of technology in storage platforms like VTL, CDP and snapshots. This is further supported by improved bandwidth utilization using features such as intelligent compression (a.k.a. data deduplication), allowing companies to exchange more data over greater distances in less time. For example, data can be backed up directly to a VTL located hundreds of miles away, then transferred to a tape library for off-site tape storage.
Analysts see a move toward lower costs, better integration and easier management. Emerging products like Bycast Inc.'s StorageGRID are recognized for storing and replicating data between multiple sites. "I believe this is the future of DR in terms of delivering data where and when you need it so that even a geographic disaster is not the same level of concern it is now," says Jerome Wendt, independent storage analyst. Analysts also look to startups to significantly accelerate replication speeds in 2007.
Virtualization is also likely to play a central role in future DR strategies. By adopting virtualization, hardware requirements become less stringent between sites -- with fewer potential configuration and setup problems to impair recovery. "Server virtualization is a huge enabler of DR," says Stephen Foskett, director of strategy services at GlassHouse Technologies Inc. "Many companies are sending VMware images off site so that they don't have to have exactly the same hardware waiting for them on the other end."
Consider the role of third-party DR service providers. Although DR services have not been terribly popular over the last few years, a resurgence may be on the horizon. "SunGard, Iron Mountain Inc. or IBM Global may come back into favor because organizations may not have a second location," says Brian Babineau, analyst at the Enterprise Strategy Group. He suggests that the economics of outside DR services may make sense for smaller enterprises that lack the budget and staff to deploy and maintain an in-house DR system.