What you will learn from this tip: Remote replication isn't cheap. Getting the biggest bang for your buck from replication requires careful management. This tip outlines four essential remote replication
The key to successful remote replication is management. That's true of just about any data protection system, but it is even truer when critical data is being replicated and stored remotely.
One reason is cost. Remote replication is not a low-cost solution, and the gold standard -- synchronous replication -- can be extremely expensive. Although the price of bandwidth and hardware has fallen significantly, remote replication is almost always more expensive than on-site backup (the possible exception occurs when you're protecting data at a remote office using a remote backup service, like Iron Mountain Inc.'s LiveVault). In addition to the initial cost, there is the ongoing cost of maintaining the remote site and the continuing cost of the bandwidth you use. To minimize these costs, you need to manage your remote replication system carefully.
Another reason is choice. Remote replication vendors offer a wide (not to say "bewildering") variety of choices for everything from architecture to options. Products from companies like FalconStor Software Inc., Topio Inc. (now part of Network Appliance Inc. [NetApp]) and Kashya Inc. (now part of EMC Corp.) offer replication from the network. Other companies, such as Hitachi Data Systems Inc. (HDS), handle replication from the storage devices. Lastly, vendors like Veritas (now part of Symantec Corp.) use the server to handle replication. Within each of these approaches there is a huge range of features and options. The products you choose, and how you manage them, will have a major impact on your costs.
A third reason is mission. Besides disaster recovery, remote replication is being used for data consolidation, data warehousing and other purposes. Even disaster recovery is increasingly fragmented. On one hand, there's disaster recovery (getting up and running again within a specified time) and on the other, business continuance (automatically cutting over to the remote site so you don't go down at all). Replication needs to be managed with an eye on the reason for replicating the data. Providing business continuance protection to a vital OLTP database requires a very different management approach than consolidating data from several branch offices so it can be loaded into a data warehouse.
Consider the following four tips when evaluating remote replication:
Manage your expectations
When most people hear "remote replication" they think of synchronous replication of everything. Every write is mirrored to the remote site as it happens and all the data is replicated.
While you can do it that way, you almost certainly won't. Having remote replication handle all your backup needs is probably going to be too expensive. Typically, remote replication will be limited to the absolutely critical parts of your system and day-to-day backup is handled locally.
This raises the question of recovery time objectives (RTO). The bedrock of a successful data protection policy of any sort is setting appropriate RTOs and backing them with the financial and personnel resources needed to meet them. Generally, the closer you get to high-availability synchronous remote replication, the more expensive the process becomes. For example, if you use replication appliances, you will typically need four of them and two communication channels to achieve your goals. Conversely, the more flexible you are in RTO, the less expensive remote replication becomes. In addition to lower hardware and software costs, bandwidth costs can be cut considerably if you are willing to queue the data to be replicated locally, which evens out the load on the channel.
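The trade-off between link size and local queueing can be sketched with some simple arithmetic. The example below is only an illustration under assumed numbers: the hourly write profile and the 500 MB-per-hour link are hypothetical, and the function name is invented for this sketch. It compares the link you would need with no queueing (sized for the peak) against a smaller link that lets a backlog build up during busy hours.

```python
def simulate_queue(writes_mb_per_hour, link_mb_per_hour):
    """Return the backlog (in MB) left in the local queue after each hour."""
    backlog = 0.0
    history = []
    for produced in writes_mb_per_hour:
        # Whatever the link cannot carry this hour stays queued locally.
        backlog = max(0.0, backlog + produced - link_mb_per_hour)
        history.append(backlog)
    return history


# Hypothetical daily profile: quiet night, busy working day, moderate evening
# (MB of changed data per hour).
hourly_writes = [100] * 8 + [900] * 8 + [200] * 8

peak_rate = max(hourly_writes)                           # link needed with no queueing
average_rate = sum(hourly_writes) / len(hourly_writes)   # smallest link that ever catches up

backlog = simulate_queue(hourly_writes, link_mb_per_hour=500)
worst_backlog = max(backlog)

print(f"peak rate:     {peak_rate} MB/h")
print(f"average rate:  {average_rate:.0f} MB/h")
print(f"worst backlog: {worst_backlog:.0f} MB on a 500 MB/h link")
```

In this made-up profile, a link sized a little above the average rate carries the day's changes, at the price of the remote copy falling several hours behind during the busy period. Whether that lag is acceptable depends on the objectives you have set for the data.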
Manage your data
How much data you need to replicate depends on your objectives. Typically, only a fraction of a site's data needs the protection of remote replication. This is especially true for disaster recovery.
Modern replication systems typically offer a broad range of choices as to what is replicated. Typically, you can set up one or more replication groups containing just the files or directories you need. In asynchronous systems, many products allow you to assign priorities to those groups, so the most important data will be replicated first.
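As a rough illustration of prioritized replication groups, the sketch below uses a priority queue so that pending changes from the most important group are sent first. It is not modeled on any vendor's product; the class names, group names and file paths are all hypothetical.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class PendingChange:
    priority: int                       # lower number = replicate sooner
    sequence: int                       # keeps arrival order within a group
    group: str = field(compare=False)
    path: str = field(compare=False)


class ReplicationQueue:
    """Orders pending changes by the priority of their replication group."""

    def __init__(self, group_priorities):
        self._priorities = group_priorities
        self._heap = []
        self._counter = 0

    def enqueue(self, group, path):
        change = PendingChange(self._priorities[group], self._counter, group, path)
        self._counter += 1
        heapq.heappush(self._heap, change)

    def drain(self):
        # Yield changes in the order they should cross the link.
        while self._heap:
            yield heapq.heappop(self._heap)


# Hypothetical groups: the OLTP database outranks mail, which outranks file shares.
queue = ReplicationQueue({"oltp-db": 0, "mail": 1, "file-shares": 2})
queue.enqueue("file-shares", "/shares/report.doc")
queue.enqueue("oltp-db", "/db/redo01.log")
queue.enqueue("mail", "/mail/store.edb")
queue.enqueue("oltp-db", "/db/redo02.log")

order = [change.path for change in queue.drain()]
print(order)
```

The sequence counter matters here: without it, two changes in the same group could be sent out of order, which is rarely what you want for database logs.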
If you only need to replicate certain application-specific data, such as an Oracle or Exchange database, you may already have the capability. Both Oracle and Exchange come with features that allow them to remotely replicate a database. However, those features are often limited, and you may want a more sophisticated solution, especially if bandwidth is a potential bottleneck. One particular problem with application-only approaches is that there is often related data that isn't part of the application, but is necessary to successfully restore the application. Typically, the application-only software won't replicate that.
In many cases, the preferable solution is a mixed approach in which only the critical parts of the enterprise data are remotely replicated and the rest is backed up locally.
Manage your bandwidth
Bandwidth concerns are perhaps the biggest difference between local and remote replication. While no backup system can afford to completely ignore backup bandwidth, it is usually less of an issue locally than remotely. Since remote bandwidth is expensive, and an overloaded channel can throw your entire replication process out of whack, it is important to use it wisely.
Many, perhaps most, remote replication products now operate at the block level with varying degrees of sophistication. Some remote replication systems, such as Kashya's, will control bandwidth usage and give higher priority to replicating the data you designate as more important. In Kashya's system, the user makes either speed or bandwidth a priority, and the system manages priorities accordingly.
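To see why block-level operation saves bandwidth, consider the simplified sketch below. It compares per-block checksums of two versions of a volume image and reports only the blocks that changed. This is an assumption-laden illustration: commercial products generally track writes as they happen rather than rescanning data, and the 4 KB block size and function names are chosen purely for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def block_digests(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of `data`."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """Return the indexes of blocks in `new` that differ from `old`.

    Blocks that exist only in `new` count as changed. Truncation of
    `new` relative to `old` is not handled in this sketch.
    """
    old_digests = block_digests(old, block_size)
    new_digests = block_digests(new, block_size)
    changed = []
    for index, digest in enumerate(new_digests):
        if index >= len(old_digests) or old_digests[index] != digest:
            changed.append(index)
    return changed


# An 8-block image in which a single byte inside block 3 is modified.
original = bytes(BLOCK_SIZE * 8)
updated = bytearray(original)
updated[BLOCK_SIZE * 3 + 10] = 0xFF

changed = changed_blocks(original, bytes(updated))
bytes_to_send = len(changed) * BLOCK_SIZE
print(changed, bytes_to_send, "of", len(updated), "bytes")
```

Only one block crosses the link instead of the whole image, which is the basic reason block-level replication is easier on a remote channel than copying whole files.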
Manage your process
First, managing the process means keeping an eye on end-to-end replication performance. Use monitoring utilities, like those built into your network or operating system, to make sure performance is not deteriorating, and set alarm levels appropriately so you will be notified if a problem is developing.
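A minimal sketch of the alarm-level idea follows, assuming you can obtain a replication lag figure (in seconds) from whatever monitoring tool you use. The threshold values, sample data and function names are hypothetical; how the lag is actually measured depends on your product.

```python
from dataclasses import dataclass


@dataclass
class LagThresholds:
    warn_seconds: float
    critical_seconds: float


def classify_lag(lag_seconds, thresholds):
    """Map a replication lag measurement to an alarm level."""
    if lag_seconds >= thresholds.critical_seconds:
        return "critical"
    if lag_seconds >= thresholds.warn_seconds:
        return "warning"
    return "ok"


def is_deteriorating(samples, window=3):
    """True if each of the last `window` samples exceeds the one before it."""
    recent = samples[-(window + 1):]
    if len(recent) < window + 1:
        return False  # not enough history to judge a trend
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


# Hypothetical lag samples (seconds), taken at regular intervals.
samples = [30, 28, 35, 60, 140, 310]
thresholds = LagThresholds(warn_seconds=120, critical_seconds=300)

states = [classify_lag(s, thresholds) for s in samples]
trend = is_deteriorating(samples)
print(states, "deteriorating:", trend)
```

Checking the trend as well as the absolute level is the point of the second function: a lag that is still under the warning threshold but rising at every sample is often the earliest sign that a problem is developing.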
Second, managing the process means running regular test restores to make sure the data can be restored smoothly. Here again, replication systems typically provide a number of features to help you perform test restores. For example, Sun Microsystems Inc.'s StorEdge Data Replicator allows you to do a test restore to another target, so you can test without disturbing your production environment.
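One simple way to check a test restore is to compare checksums of the restored files against a reference copy. The sketch below assumes both copies are reachable as ordinary directories and that the reference is a static snapshot rather than live data that is still changing. The function names and the small demonstration files are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir, restored_dir):
    """Return relative paths that are missing from, or differ in, the restore."""
    source_dir, restored_dir = Path(source_dir), Path(restored_dir)
    problems = []
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(source_dir)
        restored = restored_dir / rel
        if not restored.is_file() or file_digest(src) != file_digest(restored):
            problems.append(str(rel))
    return problems


# Demonstration with throwaway directories standing in for the two copies.
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    (Path(src) / "a.dat").write_bytes(b"ledger")
    (Path(src) / "b.dat").write_bytes(b"orders")
    (Path(dst) / "a.dat").write_bytes(b"ledger")
    (Path(dst) / "b.dat").write_bytes(b"orderz")  # simulated corruption
    problems = verify_restore(src, dst)

print(problems)
```

An empty list means every reference file was found in the restore with identical contents. It does not prove that the application will start cleanly from the restored data, so a check like this complements, rather than replaces, bringing the application up on the test target.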
About the author: Rick Cook specializes in writing about issues related to storage and storage management.
This was first published in November 2006