Referencing his company's penchant for closing deals and realizing revenue objectives just before quarterly financial results need to be reported to the SEC, Cisco Systems CEO John Chambers once quipped that his personnel were becoming "masters of the diving catch." The metaphor, derived from baseball, must resonate with anyone familiar with the execution of disaster recovery plans.
Increasingly, the successful recovery of critical business processes in the wake of an unplanned interruption is tantamount to an outfielder making a circus catch of a fly ball with an erratic and unpredictable trajectory. One big problem -- often the biggest -- is storage recovery.
Most companies have a lot of data already and are growing their data stores at a fast clip. While a few have consolidated data onto centralized platforms sporting the colors of a single vendor, most have been keen to avoid vendor lock-in or have purchased storage on an application-by-application basis, selecting what they hope is the best storage platform for the job. (This kind of free agency is anathema to the EMCs, IBMs, and Network Appliances of the world. They'll tell you over drinks that it's ruining the game.)
Platform diversification may also reflect knee-jerk acquisitions rather than any strategic thinking. Each time a dreaded "disk full" error message appears, the IT manager simply goes out to the market and buys whatever the trade rags or pundits are recommending.
Whatever the reason, diversification of storage platforms represents a serious challenge to effective storage recovery in the wake of a disaster. The reason is simple: timely recovery requires the rapid re-hosting of data at a location outside the disaster zone. Options for recovering data that is normally hosted on a diversity of platforms are as follows:
1. Replicate all production storage platforms on a one-for-one basis at the remote facility. Typical drawbacks of this strategy are high cost and, in most cases, poor data restore timeframes. Usually, you need a set of tape backups for each platform, which may entail many backup/restore software products, tape formats and drive types -- and all of these must also be replicated at the remote site. Be aware, too, that while some tape vendors tout the standardization of tape formats and drives as a harbinger of interoperability, tapes recorded on a standard drive at one location may not be readable by another drive of the same type at the remote location.
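To see why restore timeframes dominate this option, it helps to run the arithmetic. The sketch below estimates per-platform restore times; every platform name, data size and drive throughput is a hypothetical figure chosen for illustration, not a vendor specification.

```python
# Back-of-envelope restore-time estimate for a one-for-one recovery site.
# Platform names, data sizes and drive throughputs are hypothetical.

# (platform, data to restore in GB, tape drive throughput in MB/s)
platforms = [
    ("array_a", 500, 15),
    ("array_b", 1200, 30),
    ("nas_filer", 300, 10),
]

def restore_hours(gb, mb_per_s):
    """Hours to stream gb of data at mb_per_s, ignoring tape mounts and verification."""
    return (gb * 1024) / mb_per_s / 3600

times = {name: restore_hours(gb, rate) for name, gb, rate in platforms}

# If each platform restores in parallel on its own drives, the recovery
# window is set by the slowest platform, not by the total volume.
longest = max(times, key=times.get)
print(f"Slowest restore: {longest} at {times[longest]:.1f} hours")
# -> Slowest restore: array_b at 11.4 hours
```

Even with optimistic streaming rates and every drive type on hand at the remote site, the slowest platform gates the whole recovery -- which is why multiplying tape formats multiplies risk.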
2. Plan to consolidate data onto a new hosting platform at the recovery site. This strategy attacks the cost of the one-for-one replication strategy, but opens another can of worms. Specifically, how efficiently can data that is normally maintained on separate platforms be re-hosted onto a single, more capacious platform? Again, if the data is backed up to different tape formats using different backup/restore software, each of these components will need to be replicated at the remote site. Additionally, considerable work may be required to pre-configure the consolidated hosting environment and the servers that will access it so that data is placed into the appropriate partitions (or zones in the case of a Fibre Channel fabric) and made accessible to the applications that use it. Since time is of the essence, preplanning and extensive testing of this alternative are a must.
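The pre-configuration work described above can be rehearsed on paper long before a disaster. A minimal sketch of that planning step follows, assuming hypothetical application names, data sizes and a made-up partition naming scheme:

```python
# Hypothetical pre-planning check for the consolidation option: will data
# from several production platforms fit on one consolidated recovery array,
# and which partition does each application land in? All figures illustrative.

consolidated_capacity_gb = 2000

# (application, source platform, data size in GB)
datasets = [
    ("erp", "array_a", 500),
    ("email", "array_b", 700),
    ("fileshare", "nas_filer", 300),
]

total = sum(size for _, _, size in datasets)
assert total <= consolidated_capacity_gb, "consolidated array too small"

# Assign each application its own partition so recovery-site servers can
# be pre-configured to mount the right one on day one.
partition_map = {app: f"/dev/part{i}" for i, (app, _, _) in enumerate(datasets)}
print(partition_map)
```

The point of the exercise is not the script itself but the discipline: the capacity check and the application-to-partition map must exist, and be tested, before the disaster rather than improvised during it.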
3. Vault or mirror your data. In theory, remote data vaulting/mirroring addresses both of the disadvantages of the options described above. Data is backed up to a remote tape library (tape vaulting) or replicated onto a consolidated disk array or fabric (disk-to-disk mirroring) to reduce equipment duplication requirements and to expedite data recovery. This option, however, carries with it significant costs related to networking interconnects between production storage and the remote vault/mirror. If networks lack sufficient bandwidth to perform data transfers in real time without impacting production application performance, intermediary storage platforms such as caching appliances may be required. Services have appeared (and disappeared) over the last several years to facilitate this approach.
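Whether mirroring is even feasible comes down to a simple ratio: can the link absorb a day's worth of data change in less than a day? The sketch below uses hypothetical figures and ignores protocol overhead and write bursts:

```python
# Rough feasibility check for remote mirroring. The change rate and link
# speed are hypothetical illustration values, not measurements.

daily_change_gb = 80      # data written or changed per day
link_mbit_per_s = 10      # e.g. a leased line between sites

# Hours needed to push one day's changes over the link
# (GB -> MB -> megabits, divided by link speed).
seconds_needed = (daily_change_gb * 1024 * 8) / link_mbit_per_s
hours_needed = seconds_needed / 3600

# If a day's changes take more than 24 hours to transmit, the mirror can
# never catch up; a caching appliance only smooths bursts -- it cannot
# manufacture bandwidth.
feasible = hours_needed < 24
print(f"{hours_needed:.1f} hours of transfer per day; feasible={feasible}")
# -> 18.2 hours of transfer per day; feasible=True
```

At these illustrative numbers the link barely keeps pace, which is exactly the situation where a caching appliance earns its keep -- and where a modest growth in the daily change rate quietly breaks the plan.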
The bottom line is that none of the current solutions is a silver bullet. Each entails cost and time-to-data considerations that must be carefully weighed to identify the best strategy.
Large array vendors often capitalize on these complexities of cost and time-to-data, arguing for the advantages of consolidating data onto a few large platforms in the production environment. Such vendors maintain that this type of consolidation simplifies recovery by facilitating remote mirroring and eliminating the need for tape in a recovery setting.
Of course, such arguments are somewhat self-serving. The fact is that the utilities offered by large array vendors for use in remote data replication typically are limited to data exchanges between two of their own platforms. For example, EMC's SRDF -- which is a fine product for array-to-array replication -- only works if you are using Symmetrix in both the production environment and in your recovery environment. If you have Symmetrix at site A and IBM Shark or HDS Lightning at site B, neither SRDF nor comparable offerings from the other vendors can be used.
Some work is underway in the industry on cross-platform data re-hosting software. Until products are proven in actual recoveries, tape backups and human skill will be required to make the "diving catch" that will recover data and restore business processes in a timely way following a disaster.
Spring is coming. It's time to oil up your glove and get back on the practice field if you want to stay in the game.
About the author: Jon William Toigo has authored hundreds of articles on storage and technology and authors the monthly searchStorage "Toigo's Take on Storage" expert column. He is also a frequent site contributor on the subjects of storage management, disaster recovery and enterprise storage. Toigo has authored a number of storage books, including, "Disaster Recovery Planning: Strategies for Protecting Critical Information Assets, 2/e" and "The Holy Grail of Data Storage Management."
This was first published in February 2002