As a solution, someone on our team has suggested a near-online concept, with synchronous replication. Is this advisable? I also need to know if replication to the near-online site can be done at SAN level. How reliable is SAN-to-SAN replication? Is it synchronous? We want a solution where we do not lose the data, but at the same time we don't want to stop the function of the primary site during replication to the DR site.
First off, you need to determine your recovery point objective (RPO) for either your entire environment or for critical applications and data. By this I mean, how much data can you tolerate losing and not having access to? If you can tolerate the loss of 10 minutes of data, then your RPO is 10 minutes. If you cannot tolerate any loss of data, then your RPO is zero. So, for example, with an RPO of 20 minutes and asynchronous mirroring, your replicated copy should never be more than 20 minutes behind the primary.
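The RPO arithmetic above can be sketched in a few lines of code. This is a minimal illustration with hypothetical timestamps, not a description of any vendor's tooling: it compares the age of the last replicated copy against the RPO target.

```python
from datetime import datetime, timedelta

def data_loss_window(last_replicated: datetime, failure_time: datetime) -> timedelta:
    """Worst-case data loss if the primary site fails at failure_time."""
    return failure_time - last_replicated

def meets_rpo(loss: timedelta, rpo: timedelta) -> bool:
    """True if the potential loss is within the recovery point objective."""
    return loss <= rpo

# Hypothetical example: the async mirror last shipped a delta 12 minutes
# before the primary failed.
last_copy = datetime(2004, 8, 1, 9, 0)
failure = datetime(2004, 8, 1, 9, 12)
loss = data_loss_window(last_copy, failure)
print(meets_rpo(loss, timedelta(minutes=20)))  # fine for a 20-minute RPO
print(meets_rpo(loss, timedelta(minutes=10)))  # violates a 10-minute RPO
```

An RPO of zero, by the same logic, is only achievable when the copy is never behind the primary, which is what synchronous mirroring provides.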
Second, you need to figure out your recovery time objective (RTO), which is how quickly you need to regain access to your data. Note that your RTO and RPO do not have to be the same, and that they may differ by application. Great sources of information about RTO and RPO are Evan Marcus and Hal Stern's book Blueprints for High Availability and Evan Marcus and Paul Massiglia's book The Resilient Enterprise.
Third, ask yourself how far apart your data centers are going to be and what type of fiber optic service you will be using. By type of fiber optic service, I mean: Is it going to be dedicated fiber optic cable to which you can attach CWDM (Coarse Wavelength Division Multiplexing), WDM (Wavelength Division Multiplexing) or DWDM (Dense Wavelength Division Multiplexing) equipment and self-provision? Will it be a lambda bandwidth service offering from a carrier or other provider that you can allocate to Fibre Channel or Ethernet? Will it be a SONET/SDH OC-x or IP service for shared bandwidth?
I'm a big fan of implementing a multi-tier mirroring strategy in which you use synchronous mirroring for real-time data protection between sites that can be up to 100 km apart (farther with specialized equipment, a tolerance for latency and support from all of your vendors) with minimal performance impact. In the second tier of a multi-tier mirroring strategy, you asynchronously mirror data using delta copies to send data hundreds to thousands of kilometers to another site. Keep in mind that when dealing with storage over distance, while bandwidth is important, low latency is essential for data protection. You can also utilize compression and data optimization techniques, available from various vendors as standalone products or as part of SAN distance extension products like those from Brocade, Ciena, Cisco, CNT, EMC, McData, Netex and Nortel, among others, to help lower your bandwidth costs and improve latency.
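The distance limit on synchronous mirroring comes down to the speed of light: a synchronous write cannot complete until the remote site acknowledges it, so every kilometer of fiber adds round-trip delay. The sketch below uses a simplified model (roughly 5 microseconds per km one way in fiber; real links also add switch, protocol and buffering overhead) to show why synchronous replication is comfortable at tens of kilometers but painful at a thousand.

```python
# Simplified model: light in fiber covers ~200,000 km/s, i.e. about
# 5 microseconds per kilometer one way. A synchronous mirror waits for
# a full round trip (write out + acknowledgment back) on every write.
US_PER_KM_ONE_WAY = 5.0

def sync_write_penalty_ms(distance_km: float) -> float:
    """Minimum added latency per synchronous write, in milliseconds."""
    return 2 * distance_km * US_PER_KM_ONE_WAY / 1000.0

for km in (10, 100, 1000):
    print(f"{km:>5} km: +{sync_write_penalty_ms(km):.1f} ms per synchronous write")
```

At 100 km the round trip adds about 1 ms per write, which most applications can absorb; at 1,000 km it adds roughly 10 ms to every write, which is why the long-haul tier is handled asynchronously with delta copies instead.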
The actual data replication and mirroring can be done using host software available from different vendors for both open systems and mainframe environments. The mirroring and replication can also be done from a storage subsystem, as well as from new and emerging appliance devices that sit in the data path between servers and storage devices. These devices are also referred to as virtualization appliances, intelligent switches and many other creative marketing names meant to differentiate them from their counterparts. Where to do the replication is a personal preference, and the different vendors are more than willing to tell you the pros and cons of each approach. Both server-based and storage subsystem-based approaches are proven and practical, each with its own caveats. On paper, using a third-party, in-the-data-path copy mechanism is interesting, and there are plenty of vendor success stories.
This was first published in August 2004