Replicating data overseas for reliability and availability
Bits & Bytes: Evan Marcus discusses the ins and outs, and the good and bad, of data replication using synchronous and asynchronous methods. He gives a brief explanation of how each works and which one is best for longer distances.
Question: I understand that it's become possible to replicate data overseas. As someone with multiple data centers in the U.S. and Europe, I'm wondering how reliable such a system would be. Can you offer any ideas on what to expect in terms of reliability and availability from an overseas replication solution?
Evan Marcus' response:
It is certainly possible to replicate data overseas, with one or two very important caveats. As I've discussed before, there are two kinds of replication (more, actually, but we'll discuss just two here): synchronous and asynchronous.
Under synchronous replication, each data block that is written to the local disk is simultaneously sent across the WAN and written to the remote system and then a confirmation is sent back to the local system. While the data is being transmitted to the remote system and the confirmation is returning, no additional writes can be made to the local disk. As a result, synchronous replication can significantly slow down system performance while the system and its applications wait for data to cross the WAN.
The greater the distance between the local and remote systems, the longer the delay ("latency") will be. The rule of thumb is that for distances up to about 75-100 kilometers, synchronous replication's performance is generally acceptable. Of course, depending on system load, data load, network bandwidth and system performance requirements, your mileage may vary.
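To put rough numbers on the distance effect, here is a minimal sketch of the best-case delay that synchronous replication adds to each write. It assumes that signals in optical fiber travel at about 200,000 km per second (roughly two-thirds of the speed of light in a vacuum) and that each write must wait for one full round trip. It ignores switching, queuing, protocol overhead and the fact that cable routes are longer than straight-line distances, so real figures will be worse. The function names and the sample distances are illustrative only.

```python
# Assumption: light in fiber covers roughly 200,000 km per second.
FIBER_SPEED_KM_PER_S = 200_000.0


def round_trip_ms(distance_km: float) -> float:
    """Best-case propagation delay for one write plus its confirmation."""
    return 2.0 * distance_km / FIBER_SPEED_KM_PER_S * 1000.0


def max_serialized_writes_per_s(distance_km: float) -> float:
    """Upper bound on writes per second if each write waits for the previous one."""
    return 1000.0 / round_trip_ms(distance_km)


if __name__ == "__main__":
    # Sample distances are hypothetical: a metro link, a regional link,
    # and a link on the order of an ocean crossing.
    for km in (100, 1000, 6000):
        print(f"{km:>5} km: {round_trip_ms(km):6.1f} ms round trip, "
              f"at most {max_serialized_writes_per_s(km):7.1f} serialized writes/s")
```

Under these assumptions, a 100-kilometer link adds about a millisecond per write, while a link of several thousand kilometers adds tens of milliseconds, which is why synchronous replication becomes impractical over long distances.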
Under asynchronous replication, data is written to a local log and to the local data store. The log is constantly drained and its data sent across the WAN to the remote system, but this is done without delaying other transactions. So asynchronous replication still ensures that data makes it to the remote host, although the data on the remote host may be some number of transactions behind the data on the local host.
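The log-and-drain behavior described above can be modeled in a few lines. This is a simplified sketch, not any vendor's implementation: the class and method names are hypothetical, the "WAN" is just a dictionary update, and draining is triggered by hand rather than by a background process.

```python
from collections import deque


class AsyncReplicator:
    """Toy model: local writes return at once; a log is drained to the remote later."""

    def __init__(self):
        self.local = {}     # local data store
        self.remote = {}    # stand-in for the remote system
        self.log = deque()  # writes not yet shipped across the WAN

    def write(self, key, value):
        # The write completes locally without waiting for the remote side.
        self.local[key] = value
        self.log.append((key, value))

    def drain(self, max_entries):
        """Ship up to max_entries logged writes to the remote, oldest first."""
        shipped = 0
        while self.log and shipped < max_entries:
            key, value = self.log.popleft()
            self.remote[key] = value
            shipped += 1
        return shipped

    def lag(self):
        """Number of writes the remote is currently behind."""
        return len(self.log)


replicator = AsyncReplicator()
for i in range(5):
    replicator.write(f"block{i}", i)
replicator.drain(3)
print("writes behind:", replicator.lag())
```

If the local site failed at this point, the writes still sitting in the log would be the transactions lost, which is the "some number of transactions behind" exposure.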
Asynchronous replication is best suited to replication over longer distances, including intercontinental replication.
Note that if you elect to use asynchronous replication, not all vendors will guarantee that the data you write locally will arrive at the remote side in the same order in which it was written locally. Be sure to check this capability with your vendor; otherwise, you may wind up with data on the remote side that is unusable after the local side suffers an outage.
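One common way to preserve write order is to tag each write with a sequence number and have the receiver hold back anything that arrives early. The sketch below illustrates that general idea only; it is an assumption for illustration, not a description of how any particular product works, and the class name, keys and values are hypothetical.

```python
class OrderedReceiver:
    """Applies replicated writes strictly in sequence order, buffering early arrivals."""

    def __init__(self):
        self.store = {}    # data as seen on the remote side
        self.applied = []  # sequence numbers, in the order they were applied
        self.pending = {}  # writes that arrived ahead of their turn
        self.next_seq = 1  # next sequence number allowed to be applied

    def receive(self, seq, key, value):
        self.pending[seq] = (key, value)
        # Apply every write that is now contiguous with what was already applied.
        while self.next_seq in self.pending:
            k, v = self.pending.pop(self.next_seq)
            self.store[k] = v
            self.applied.append(self.next_seq)
            self.next_seq += 1


receiver = OrderedReceiver()
receiver.receive(2, "balance", 50)    # arrives early, so it is held back
receiver.receive(3, "audit", "done")  # also held back
held_back = dict(receiver.store)      # nothing has been applied yet
receiver.receive(1, "balance", 100)   # gap filled: writes 1, 2, 3 apply in order
print(held_back, receiver.store, receiver.applied)
```

Without this kind of ordering, the remote side could end up holding the earlier value of "balance" as if it were the latest, which is the sort of unusable state the warning above describes.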
So, in terms of availability, replication is a much better and more reliable solution than backups, which are generally the only other way to get data from a local system to a remote one. Even though you may lose a small number of transactions under asynchronous replication during an outage, the loss will almost always be much smaller than if you were using tapes, where all data written since the last backup would be lost.
Of course, the flip side is that replication is a very expensive solution. It can require a very long WAN to connect the two sites, and WANs can be very expensive. As a result, replication, especially over long distances, is not the right solution for every system and every organization.
Hope this helps. Thanks!
Evan L. Marcus
This was first published in June 2003