Second-generation CDP


This article can also be found in the Premium Editorial Download "Storage magazine: CDP 2.0: Finding success with the latest continuous data protection tools."


Recording changes
Host-side agents are by far the most prevalent method of detecting and capturing data changes. The notable exception is EMC RecoverPoint, which analyzes I/Os within a Fibre Channel (FC) fabric without the need to install agents on protected systems. Host-side agents implement filter drivers at the file-system or volume level, which the OS invokes whenever changes are saved. Depending on the implementation, data changes are then replicated from the protected host to a CDP repository in real time or at defined intervals.
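The difference between real-time and interval-based replication can be illustrated with a minimal Python sketch. It is not modeled on any vendor's filter driver: the `CaptureAgent` class, its `on_write` callback and the `send` function are all hypothetical names, and a real filter driver runs inside the OS rather than in application code.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Change:
    """One captured write: where it landed and what was written."""
    volume: str
    offset: int
    data: bytes


@dataclass
class CaptureAgent:
    """Hypothetical host-side agent; an illustration, not a vendor design."""
    send: Callable[[List[Change]], None]   # ships changes to the CDP repository
    realtime: bool = True                  # False = hold changes until flush()
    pending: List[Change] = field(default_factory=list)

    def on_write(self, volume: str, offset: int, data: bytes) -> None:
        # Stands in for the filter-driver callback the OS invokes on a write.
        change = Change(volume, offset, data)
        if self.realtime:
            self.send([change])            # replicate immediately
        else:
            self.pending.append(change)    # replicate at the next interval

    def flush(self) -> None:
        # In interval mode, a timer would call this at the defined interval.
        if self.pending:
            self.send(self.pending)
            self.pending = []
```

In real-time mode every write reaches the repository as it happens; in interval mode, anything still pending when the host fails is not yet in the repository.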

CDP implementations are split almost evenly between file-level and volume filter drivers, and each approach has its pros and cons. Volume filter drivers have the advantage of being file-system agnostic, which simplifies support for multiple platforms. Most vendors with significant platform support beyond Windows, such as EMC RecoverPoint (which supports both fabric-based capture and a volume filter driver), InMage DR-Scout and Symantec Veritas NetBackup RealTime Protection, have opted for volume filter drivers. Moreover, vendors with volume filter driver implementations tout the approach as less complex. "All we have to know is volume information, whereas products sitting on top of the file system ... need to track many more file attributes," says InMage Systems' Atluri.

On the other hand, the ability to capture whatever file attributes are required enables vendors like Asempra to implement unique features that volume filter driver implementations can't match. "The fact that we know everything about a file set enables us to virtualize data sets, which we can present to applications like Exchange Server just seconds after a failover," explains Gary Gysin, president and CEO at Asempra. In other words, Asempra Business Continuity Server can present all relevant file-system metadata to users and applications right away while the actual data is restored in the background, greatly reducing the time between a failover and when files and applications can be used.

The CDP repository
The second key CDP component is the CDP repository, which typically stores two types of data: a replica of the protected data and a log of all changes for a defined period of time. Whenever a change is sent to the CDP repository, it's applied to the replica, synchronizing it with the protected source and making the replica usable as a production image in the case of a failover. This is very similar to data protection via array-based replication. But in contrast to replication-based data protection, all replicated changes are also stored in a change log or change journal, which tracks every change.
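The replica-plus-journal layout can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not any product's on-disk format: the `CdpRepository` and `JournalEntry` names are hypothetical, data is tracked per block number, the journal keeps a before-image of each block, and changes are assumed to arrive in timestamp order.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class JournalEntry:
    timestamp: float
    block: int
    before: Optional[bytes]   # None means the block did not exist yet
    after: bytes


@dataclass
class CdpRepository:
    """Hypothetical repository: a current replica plus a change journal."""
    replica: Dict[int, bytes] = field(default_factory=dict)
    journal: List[JournalEntry] = field(default_factory=list)

    def apply(self, timestamp: float, block: int, data: bytes) -> None:
        # Log the change (with its before-image), then update the replica
        # so it stays synchronized with the protected source.
        self.journal.append(
            JournalEntry(timestamp, block, self.replica.get(block), data)
        )
        self.replica[block] = data

    def image_at(self, timestamp: float) -> Dict[int, bytes]:
        # Copy the current replica and undo newer changes, newest first.
        # Assumes journal entries are in nondecreasing timestamp order.
        image = dict(self.replica)
        for entry in reversed(self.journal):
            if entry.timestamp <= timestamp:
                break
            if entry.before is None:
                image.pop(entry.block, None)
            else:
                image[entry.block] = entry.before
        return image
```

The replica always reflects the latest state and can serve as a failover image, while `image_at` shows how a journal of before-images makes an earlier point in time recoverable without keeping full copies.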

When a file or application needs to be restored to a previous point in time, changes are reversed by traversing the change journal. Because the CDP repository stores a synchronized copy of the protected data as well as a list of changes, its size must be the size of the protected data plus the space needed for the change log. The size of the change journal depends on the number of days for which any-point-in-time recovery is required, as well as the number of changes. For frequently changing data, the change journal grows more rapidly. For instance, if 20% of the size of the replica is reserved for the change log, a database environment in which 20% of the data changes each day would allow a rollback of about one day. "We recommend 24 to 72 hours CDP recovery and defer to other disk-based or tape-based restore methods if data beyond this period needs to be recovered," says Rick Walsworth, EMC's director of product marketing.
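The sizing arithmetic in this example can be written out as a short Python sketch. The function names are hypothetical, and the model is deliberately rough: it assumes a uniform daily change rate and ignores journal metadata overhead, compression and repeated writes to the same data.

```python
def rollback_window_days(replica_gb: float,
                         journal_fraction: float,
                         daily_change_fraction: float) -> float:
    """Approximate days of any-point-in-time recovery a journal can hold."""
    if daily_change_fraction <= 0:
        raise ValueError("daily_change_fraction must be positive")
    journal_gb = replica_gb * journal_fraction            # space reserved for the log
    daily_change_gb = replica_gb * daily_change_fraction  # data logged per day
    return journal_gb / daily_change_gb


def repository_size_gb(replica_gb: float, journal_fraction: float) -> float:
    """Replica size plus the space reserved for the change journal."""
    return replica_gb * (1 + journal_fraction)
```

With a journal reservation of 20% and a 20% daily change rate, `rollback_window_days` gives about one day, matching the example above. Under the same assumptions, covering 72 hours at that change rate would take a reservation of roughly 60% of the replica size.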

A critical aspect of any CDP evaluation is the mechanism a CDP product has in place to fail back to the production system after a failover. The failback has to be simple and provisions need to be put in place to prevent data and transaction loss. The methods vendors put in place vary in implementation, ease of use and capabilities. For instance, InMage Systems puts an agent on protected and failover servers. "By having an agent on the failover server, the failover server simply starts replicating to the CDP repository when a failover occurs," says the firm's Atluri.

This was first published in October 2008
