When Hurricane Ike struck last summer, many feared a repeat of the 2005 Katrina catastrophe. While Ike turned out to be one of the most destructive hurricanes on record, its impact was nowhere near as devastating as Katrina's. This was due, in part, to better preparedness across the board.
From an IT perspective, Katrina raised the level of consciousness regarding disaster recovery (DR) and spurred more organizations to invest in "recoverability."
However, building a DR capability, particularly in the current economic climate, can still be a tough sell. The business justification for DR is based primarily on risk avoidance -- a so-called soft metric -- rather than on hard cost savings. In addition, DR implementations often involve an investment in infrastructure that mostly sits idle waiting for that fateful day. As a result, organizations may be tempted to shelve DR initiatives or re-prioritize them further down the IT project stack.
Unfortunately, disasters don't distinguish between good and bad economic times. In fact, you could easily argue that preparedness is even more critical in tough times, when institutions are less able to tolerate instability.
Disaster recovery is an insurance policy and, as with any other type of insurance, we dislike the possibility of paying for something we may never use. Therefore, when formulating a DR strategy, it's important to minimize unnecessary purchases and maximize utilization of DR assets wherever possible. The first step to attaining these goals is a solid understanding of business requirements with respect to DR. Over-designing and over-delivering beyond actual needs may satisfy service-level requirements, but it's inefficient and won't help make your case for future DR dollars. Conversely, short-changing the process to save money is often a good way to get caught off guard when disaster does strike.
When it comes to storage-related DR services, many organizations fall into one of two categories:
- One size fits all. There's a single service option, which is often tape-based recovery. For mid- to large-sized organizations, it's highly unlikely that all application needs can be adequately met with a single solution; it's too much for some and not enough for others.
- Two sizes fit all. In addition to tape, a data replication option exists. While this may be adequate in some cases, we've encountered situations where the gap between the two service levels -- tape-based recovery vs. synchronous replication, for example -- is simply too large. This forces a choice between a very high-cost option that more than meets requirements and a much less expensive one that falls short.
It's important for an organization to develop a catalog of DR services that can address the required range of recovery time objective (RTO) and recovery point objective (RPO) requirements at appropriate cost differentials to avoid over-delivery.
This top-down approach of gathering business requirements to formulate a strategy must then be tempered with a bottom-up understanding of the "step-function" cost implications of various protection options. The iterative process of testing assumptions and validating business needs against the likely cost to meet those needs results in a more realistic stratification of requirements from which we can develop a DR service catalog that aligns efficiently with the business.
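As a rough illustration of how such a catalog might be used, the sketch below picks the least expensive service tier that still meets an application's RTO and RPO. The tier names echo the options discussed in this article, but every RTO, RPO, and cost figure is a hypothetical placeholder, as are the `DrTier` and `cheapest_tier` names; real values would come out of the requirements-gathering and cost-validation process described above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DrTier:
    name: str
    rto_hours: float      # worst-case time to restore service
    rpo_hours: float      # worst-case window of lost data
    relative_cost: float  # illustrative cost units, not real pricing


# Hypothetical catalog: all numbers are placeholders for illustration only.
CATALOG: List[DrTier] = [
    DrTier("tape-based recovery", 72.0, 24.0, 1.0),
    DrTier("deduplicated VTL replication", 24.0, 12.0, 3.0),
    DrTier("asynchronous replication", 4.0, 0.25, 8.0),
    DrTier("synchronous replication", 1.0, 0.0, 20.0),
]


def cheapest_tier(required_rto_hours: float,
                  required_rpo_hours: float,
                  catalog: List[DrTier] = CATALOG) -> Optional[DrTier]:
    """Return the lowest-cost tier meeting both objectives, or None."""
    candidates = [
        tier for tier in catalog
        if tier.rto_hours <= required_rto_hours
        and tier.rpo_hours <= required_rpo_hours
    ]
    # Choosing the cheapest qualifying tier avoids over-delivery.
    return min(candidates, key=lambda tier: tier.relative_cost, default=None)


if __name__ == "__main__":
    tier = cheapest_tier(required_rto_hours=48, required_rpo_hours=24)
    print(tier.name if tier else "no tier meets the requirement")
```

A result of `None` signals a requirement that no tier in the catalog can meet, which is the cue to revisit either the business requirement or the catalog itself.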
The many forms of replication
The key to quick data recovery is replication, and while the gold standard continues to be synchronous replication, the available alternatives have become quite extensive. This allows for a wide range of options at a variety of price points.
Enterprise arrays tend to offer the broadest range of options, including synchronous and asynchronous replication, as well as potentially critical features like consistency groups and various types of multi-hop replication. As these systems now also support a growing range of disk types (from solid state to Fibre Channel to SATA), it's conceivable that a multitier DR solution can be configured within a single storage platform.
More commonly, organizations look to midrange storage platforms for lower cost replication options. The challenge here is that the replication capabilities of these systems can vary significantly, from vendors that offer essentially the same or nearly the same functionality as on their tier 1 systems to those that offer only basic replication.
In environments with more limited needs -- a handful of key applications requiring replication, for example -- host- or application-based replication can often be leveraged at an even lower price. These approaches, including database log shipping, volume- or file-based replication, and application-specific replication tools, can provide a very cost-effective way to meet requirements as long as the number of applications and servers remains manageable. But configuring and monitoring a large number of such systems can increase complexity, and a range of software solutions may be required to support different applications. At that point, organizations typically look to a broader option, usually at the storage level.
Reducing DR costs
Server virtualization promises many benefits for DR and is causing organizations to rethink their overall DR approach. One of its most significant cost-related effects is on the problem of idle systems. Not only can virtualization dramatically reduce the number of physical servers required for a DR site but, due to the very nature of virtualization, these servers may actually be leveraged for daily operations. For example, DR systems may be deployed under normal circumstances for test and quality assurance, but take on the alternate (or additional) role of hosting production virtual machines failed over from the primary site in a DR scenario. When trying to justify DR costs, non-idle, multifunction assets can be the difference between a winning and a losing business case.
Given our fascination with virtualization, it's only reasonable to look there for potentially more affordable storage options as well. Although the promise of virtualization at the SAN fabric level has yet to be realized on a large scale, one service capability offered by such technology is replication, and it can be done heterogeneously between different types of arrays. This opens the possibility of replicating to a less-expensive array.
Beyond replication, disk-based backup can also play an important role in a multitiered DR strategy. Virtual tape libraries with deduplication and replication capabilities can provide a level of service below primary storage replication but higher than tape-based recovery. Because data is deduplicated, the bandwidth requirements are usually lower; and by replicating to a similar platform at the DR site, recovery through restore can be significantly quicker than recovery from tape.
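To give a feel for the bandwidth effect, here is a back-of-the-envelope sketch in Python. It assumes that a known volume of backup data must be replicated within a fixed window and that deduplication shrinks that volume by a constant ratio; the function name `required_mbps` and all of the input figures are hypothetical, and actual deduplication ratios vary widely with the data.

```python
def required_mbps(daily_backup_gb: float,
                  dedup_ratio: float,
                  window_hours: float) -> float:
    """Estimate the sustained link speed (megabits/s) needed to replicate
    one day's backup data within the replication window.

    dedup_ratio is the assumed reduction factor (10 means 10:1);
    use 1 to model replication without deduplication.
    """
    if dedup_ratio <= 0 or window_hours <= 0:
        raise ValueError("dedup_ratio and window_hours must be positive")
    bytes_to_send = daily_backup_gb * 1e9 / dedup_ratio  # decimal gigabytes
    bits_to_send = bytes_to_send * 8
    seconds = window_hours * 3600
    return bits_to_send / seconds / 1e6


if __name__ == "__main__":
    # Hypothetical inputs: 500 GB per day, an 8-hour replication window.
    print(f"no dedup:   {required_mbps(500, 1, 8):.1f} Mbps")
    print(f"10:1 dedup: {required_mbps(500, 10, 8):.1f} Mbps")
```

With these illustrative inputs, the link requirement falls from roughly 139 Mbps to roughly 14 Mbps. The estimate ignores protocol overhead and assumes the link is fully available for the whole window.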
A recent global DR study by Symantec Corp. of 1,000 IT managers and C-level executives found that 56% of applications are now classified as mission critical, up from 36% in 2007. This has serious implications for the IT infrastructure in that it indicates a continuing rise in business demands and expectations. Meeting these demands in a time of increased budget constraints requires the careful application of the appropriate technologies to satisfy requirements without over-delivering. Understanding not only the potential benefits but also the operational implications of these additional options is, of course, essential before heading down a given path. But it's clear that meeting current and future demands will require such a multifaceted approach.
BIO: Jim Damoulakis is CTO at GlassHouse Technologies, a leading independent provider of storage and infrastructure services. He can be reached at firstname.lastname@example.org.