My role at the Enterprise Storage Group is to focus on enterprise security, which encompasses storage and all other technology realms. In this role, I've come to an overwhelming conclusion: IT people are definitely spooked by the unprecedented number of perils facing them. Fold these often external dangers in with ongoing compliance tasks (Gramm-Leach-Bliley, HIPAA, Sarbanes-Oxley, etc.) and you have an environment where business continuity has never been a more critical issue. As a result, ESG believes sales of disaster recovery (DR) products and services will grow from approximately $3 billion in 2001 to more than $5 billion by 2007.
Historically, the concept of DR was linked to a number of specialized service providers. These companies provided "hot" and "cold" site services as an insurance policy against rare natural disasters, such as hurricanes, floods or earthquakes. This protection was not only costly, it usually meant some degree of downtime as service providers set up and emulated corporate systems in remote data centers or in tractor trailers. Again, costs and inconveniences were the price one paid for a worst-case insurance policy.
DR services may not suffice
In today's post-Sept. 11 world, security risks, terror alerts and compliance issues have moved DR into the mainstream. What's more, global 24x7 processing requires uninterrupted business operations. These needs limit the effectiveness of many DR services.
Given these shortcomings, ESG finds more large companies bringing DR in-house for several reasons:
- Business collocation. New York-based firms don't want their backup data centers in DR facilities in Arizona or Florida. By owning its own facilities, a New York company, for example, would have abundant options in preferred, nearby locations such as New Jersey or Brooklyn.
- Systems are available all the time. Many DR service providers limit the number of tests to two per year. This isn't adequate when IT is constantly changing configurations to meet business requirements. By taking DR back in-house, companies can test as often as necessary to ensure their readiness for a real emergency.
- Architectural flexibility. The traditional concept of a redundant DR site is rapidly becoming obsolete. Geographic clustering, grid computing and wide-area storage area networks (SANs) mean companies that own their DR infrastructure can use that equipment for everyday business processing while maintaining protection.
- What is the acceptable data loss? This goes back to the recovery time objectives (RTOs) and recovery point objectives (RPOs) mentioned earlier. Generally, if a company needs to mirror transactions with minimal data loss, it must use the synchronous replication tools typically provided by storage array vendors such as EMC, Hitachi Data Systems and IBM. Less-restrictive data loss requirements widen the field to a potpourri of asynchronous options from software companies like NSI, Topio and Veritas.
- What choices are presented by the telecommunications infrastructure? It may seem obvious that existing data centers should be connected, but if there is no fiber infrastructure in place, network construction costs alone could run into eight-figure territory. Given this restriction, it's best to start by examining the existing telco infrastructure and related real-estate options.
- Which natural factors should be considered? Often, the biggest obstacle is Mother Nature. For example, with the multiple storms ravaging Florida in 2004, local companies should probably reconsider locating their DR sites nearby. San Francisco-based companies, on the other hand, typically replicate their data to Salt Lake City or Phoenix, rather than just across the bay to Oakland, because the two California cities sit on some of the same fault lines that cause the area's occasional earthquakes.
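The data-loss question above lends itself to a simple decision rule. The following is a minimal sketch in Python, not a vendor tool: the `Workload` class, the workload names and the zero-second threshold are all illustrative assumptions, and a real decision would also weigh distance, latency and cost.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """A hypothetical application and its recovery point objective."""
    name: str
    rpo_seconds: float  # maximum tolerable data loss, expressed as time


def replication_mode(workload: Workload, sync_threshold_seconds: float = 0.0) -> str:
    """Suggest a replication style from the RPO alone.

    Assumption: an RPO at or below the threshold means no measurable
    data loss is acceptable, which points to synchronous mirroring.
    Anything looser opens up the asynchronous options.
    """
    if workload.rpo_seconds <= sync_threshold_seconds:
        return "synchronous"
    return "asynchronous"


if __name__ == "__main__":
    for w in (Workload("order-entry", 0), Workload("file-shares", 3600)):
        print(f"{w.name}: {replication_mode(w)}")
```

The point of the sketch is that the RPO, a business requirement, drives the technology choice rather than the other way around.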
Once site locations have been selected, companies must also consider the many IT challenges involved in DR. Too often, DR is equated with mirroring or replicating storage, but DR crosses many functional IT boundaries, which can lead to organizational confusion. For example, while the storage team talks about Fibre Channel (FC), ESCON and FICON, the networking group is off in its own world of IP, Ethernet and VPNs, while telecommunications experts dwell on PBX, SONET and DS-3s.
The diverse spheres of interest and ensuing language differences can lead to costly mistakes. At one company, the storage team wanted to extend an FC-SAN across geographies and asked the networking/telco team to procure the necessary pipes from its carrier. When the networking guys looked at the requirement for 2Gb/sec of bandwidth, they thought this must be a typo and ordered a standard T-3 (45Mb/sec) to cover the storage requirement along with a few others. This gaffe led to a few shocking realizations. The networking people learned that storage applications eat up a whole mess of bandwidth, while the storage team found that extending storage over distances requires very expensive optical circuits (in this case a 2.488Gb/sec OC-48).
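A quick back-of-the-envelope calculation shows how far apart the two teams were. The sketch below uses the line rates quoted above; the 100GB data set is a hypothetical figure chosen for illustration, and the math assumes full line rate with no protocol overhead, so real transfers would take longer.

```python
# Line rates quoted in the article, in megabits per second.
T3_MBPS = 45.0
OC48_MBPS = 2488.0
REQUIRED_MBPS = 2000.0  # the storage team's 2Gb/sec requirement


def shortfall_factor(required_mbps: float, link_mbps: float) -> float:
    """How many times larger the requirement is than the link's capacity."""
    return required_mbps / link_mbps


def hours_to_transfer(gigabytes: float, link_mbps: float) -> float:
    """Hours to move a data set at full line rate (decimal units, no overhead)."""
    megabits = gigabytes * 8 * 1000  # 1 GB = 8,000 megabits
    return megabits / link_mbps / 3600


if __name__ == "__main__":
    print(f"Requirement exceeds T-3 by {shortfall_factor(REQUIRED_MBPS, T3_MBPS):.1f}x")
    print(f"100 GB over T-3:   {hours_to_transfer(100, T3_MBPS):.1f} hours")
    print(f"100 GB over OC-48: {hours_to_transfer(100, OC48_MBPS) * 60:.1f} minutes")
```

Under these assumptions, the requirement is roughly 44 times what a T-3 can carry. Moving 100GB would take about five hours over the T-3, versus a little over five minutes over the OC-48.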
Steps toward in-house DR
There are significant benefits to bringing DR in-house, but the advantages come with demanding technical and IT considerations. When implementing an in-house DR plan, companies should:
- Start by defining business requirements. This seems obvious, but you'd be surprised how often the business side is minimized in the planning process. Corporate officers, line-of-business managers, major customers and suppliers should be involved up front. Complement their input by educating business folks on the efforts involved and related costs. Any line-of-business manager worth their salt will want protection, but not if it costs $1 million to protect $500,000 of revenue.
- Build a cross-functional IT team to manage the DR project. Recruit top applications, systems, networking, storage and telecommunications people to drive the DR project. Keep in mind that the DR project is never really done, so these folks will have to remain together to modify and test the architecture over time.
- Test everything frequently. Police, fire and EMT personnel train constantly so they can respond appropriately in situations where emotions run high and time is precious. Treat your DR processes the same way with frequent practice drills, evaluations and improvements.
- Understand the limitations of the carriers. Your carrier should be on the hook to keep the network available, so protect yourself with strict service-level agreements. Realize, however, that if a problem is not in the network, you own it.
- Monitor the DR path on an end-to-end basis. Too many companies manage the DR architecture as a series of components, such as servers, storage and networking equipment. Remember that all this gear works as a business continuance system to protect the company in the event of a failure. To defend against a problem while minimizing operational headaches, monitor and manage all the components of the DR path through a common set of tools, personnel and processes.
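The last point, treating the DR path as one system rather than a pile of parts, can be illustrated with a small sketch. The component names and the `Component` and `dr_path_status` helpers below are hypothetical. A real deployment would feed the `healthy` flags from whatever monitoring tools the company already runs.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Component:
    """One hop in a hypothetical DR path, with its last known health."""
    name: str
    healthy: bool


def dr_path_status(components: List[Component]) -> Dict[str, object]:
    """Roll component health up into a single end-to-end verdict.

    The path is only as good as its weakest hop: one failed component
    means the whole business continuance system is not protecting anything.
    """
    failed = [c.name for c in components if not c.healthy]
    return {"path_ok": not failed, "failed": failed}


if __name__ == "__main__":
    path = [
        Component("primary-array", True),
        Component("fc-switch", True),
        Component("wan-circuit", False),  # e.g. carrier outage
        Component("dr-site-array", True),
    ]
    print(dr_path_status(path))
```

A single roll-up like this gives the cross-functional team one answer to the question that matters, namely whether the company is protected right now, while still naming the hop that needs attention.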
Sales of DR products and services are spiraling upwards because of an environment that features escalating threats and round-the-clock IT requirements. Through careful planning, defined processes and technical implementation, ESG believes most enterprise companies are best served by rolling their own when it comes to DR architecture. Smart firms will leverage this new infrastructure to improve flexibility, streamline costs and provide the best possible protection for critical business operations.
This was first published in November 2004