
CDP is just one piece of the puzzle

The benefits of continuous data protection are obvious for backup, but CDP can also play a key role in overall data management.

When it comes to business computing, there's been a tremendous amount of progress in the last few decades. Processing power and storage capacities that used to cost millions of dollars can now be purchased at the local Best Buy, CompUSA or Fry's Electronics for a mere pittance.

But some things never change. In spite of these wonderful advances, backup/restore issues continue to dog the industry. According to Enterprise Strategy Group (ESG) research, approximately one-quarter of companies say that 20% or more of their tape-based backups and restores fail because of issues like media malfunctions, human error and software failures. Can't some intelligent vendor throw some new technology at this age-old problem?

Old problem, new solution
Technology vendors haven't been asleep at the wheel; witness innovations like remote mirroring, journaling file systems and disk-to-disk backup. Over the past few years, another new technology, continuous data protection (CDP), has come upon the scene and promises to improve data protection and today's archaic backup/restore processes. There might still be some confusion about a precise definition of CDP, but ESG defines it as a "software- or appliance-based solution designed to capture each and every write to primary storage and then make a time-stamped mirrored copy on a secondary device. The objective of CDP is rapid data recreation and restoration--with tremendous granularity--as it existed at any previous point in time."
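To make the ESG definition concrete, here is a minimal sketch of the idea: every write is journaled with a time stamp, and the data can be rebuilt as it existed at any earlier moment. The `WriteJournal` class, its method names and the block labels are hypothetical and exist only for illustration; a real CDP product would journal to a secondary device rather than to memory.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class WriteJournal:
    """Hypothetical in-memory journal: one time-stamped entry per write."""
    timestamps: list = field(default_factory=list)  # ascending write times
    entries: list = field(default_factory=list)     # (block, data) per write

    def record(self, timestamp, block, data):
        # Assumption: writes arrive in non-decreasing time order.
        if self.timestamps and timestamp < self.timestamps[-1]:
            raise ValueError("writes must be recorded in time order")
        self.timestamps.append(timestamp)
        self.entries.append((block, data))

    def restore(self, point_in_time):
        """Rebuild the volume as it existed at point_in_time (inclusive)."""
        cutoff = bisect_right(self.timestamps, point_in_time)
        volume = {}
        for block, data in self.entries[:cutoff]:
            volume[block] = data  # later writes overwrite earlier ones
        return volume


journal = WriteJournal()
journal.record(100, "block-7", b"draft")
journal.record(160, "block-7", b"final")
journal.record(200, "block-9", b"oops")

# State just before the second write landed.
print(journal.restore(150))  # {'block-7': b'draft'}
```

Replaying the journal up to a chosen time stamp is what gives the "tremendous granularity" in the definition: the restore point can fall between any two writes, not only on a backup boundary.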

In some ways, CDP can be viewed as a "poor man's remote mirroring." Rather than capture and copy each transaction on its own, CDP records all data changes for a predefined time interval and then ships these batch files to another storage system for safety. Unlike with remote mirroring, some exposure remains, but the overall risk of data loss is greatly reduced. If a CDP application takes a snapshot of the data every 15 minutes, the worst-case scenario is that an organization could lose 14 minutes and 59 seconds' worth of data. In a traditional backup environment, as much as eight to 12 hours' worth of data could go AWOL in the event of a technology failure or disaster. This is a significant improvement--15 minutes of lost data is equivalent to approximately 2% of 12 hours of lost data.
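The arithmetic behind that comparison can be checked with a short sketch. The function name `loss_ratio` is an assumption made for illustration, and the model is deliberately simple: worst-case loss is taken to equal the full interval between copies.

```python
def loss_ratio(cdp_interval_minutes, backup_interval_hours):
    """Worst-case CDP loss as a fraction of worst-case backup loss.

    Simplifying assumption: in both cases the worst-case loss is one
    full interval between copies.
    """
    return cdp_interval_minutes / (backup_interval_hours * 60)


# 15-minute snapshots compared with the 8- and 12-hour exposures in the text.
for hours in (8, 12):
    ratio = loss_ratio(15, hours)
    print(f"{hours}-hour backup window: ratio = {ratio:.4f}")
```

For the 12-hour case the ratio is 15/720, or roughly 0.02, which matches the "approximately 2%" figure above.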

Obviously, backup needs to be improved, but do we really need to capture a snapshot of our data four times an hour or more? As basketball announcer Marv Albert might say: "Yes!" Organizations are clamoring for better backup/restore options because:

  • THE BUSINESS DEMANDS HIGH AVAILABILITY. To support electronic-based business processes and global operations, companies must ensure that their systems are up and running around the clock. If systems fail for even a short time, the business will likely suffer a financial impact. According to ESG research, 31% of organizations say they'll experience significant revenue loss or other adverse business impacts as a result of one hour or less of application downtime; another 58% say they'd suffer financially in the event of application downtime of four hours or less. This data serves as quantitative proof of the old adage, "Time is money."

  • OTHER DATA-PROTECTION SOLUTIONS ARE RESERVED FOR THE DIGITAL ELITE. Many large organizations would love to mirror transactions for up-to-the-minute protection, but the economics don't add up. Remote mirroring demands a complex and costly storage, software and networking infrastructure that's beyond the reach of all but the largest enterprises. CDP may help bring remote mirroring-like benefits to the masses without requiring many new technology investments.

  • USERS WANT PROTECTION AND ONLINE ACCESS. Over the past few years, ESG has seen a precipitous rise in the popularity of disk-to-disk backup. Some of this growth is certainly related to the need for faster backups, but disk-based data also supports faster--and easier--restores. This is especially important because a little more than one-third of users say that most restore operations are related to data that's less than 24 hours old. When the CEO "fat fingers" his laptop PC and loses a file, the combination of disk-to-disk backup and CDP will remedy the problem far faster than traditional tape-based restores.

With its many benefits, CDP can be considered a "killer app" for information lifecycle management (ILM). Storage managers can use CDP to constantly feed data to a tiered storage infrastructure. The scenario might play out something like this: The first 24 hours of data can remain on Tier-1 storage platforms. Data that's two days to 10 days old can be moved to midrange or Tier-2 storage platforms. Older data that's between 11 days and 60 days old can be placed on high-capacity Tier-3 storage, while even older data can be migrated to tape. In this way, CDP-based tiering can marry data protection and automation with capital cost savings.
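The age-based scenario above can be expressed as a small placement policy. This is only a sketch: the tier labels, the `TIER_POLICY` table and the `tier_for_age` function are hypothetical names, and data between 24 hours and two days old, which the scenario doesn't assign, is assumed here to fall into Tier 2.

```python
from datetime import timedelta

# Age ceilings taken from the scenario in the text; labels are illustrative.
# Assumption: data aged between 24 hours and two days is treated as Tier 2.
TIER_POLICY = [
    (timedelta(hours=24), "tier-1"),
    (timedelta(days=10), "tier-2"),
    (timedelta(days=60), "tier-3"),
]


def tier_for_age(age):
    """Return the storage tier for data of the given age (a timedelta)."""
    for ceiling, tier in TIER_POLICY:
        if age <= ceiling:
            return tier
    return "tape"  # anything older than the last ceiling goes to tape


for age in (timedelta(hours=3), timedelta(days=5),
            timedelta(days=30), timedelta(days=90)):
    print(age, "->", tier_for_age(age))
```

Keeping the thresholds in a table rather than in nested conditionals means a shop could adjust the age ceilings without touching the placement logic.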

CDP is just part of the process
CDP has great potential benefits, but it shouldn't be viewed as an isolated technology widget. Rather, CDP should be treated as a little piece of a much more profound process and business change. As such, storage managers should look at CDP in a holistic manner, using it as a springboard for addressing the following:

  • DATA CLASSIFICATION AND TRACKING. ESG has been monitoring data classification since 2003 and has seen little progress in this area. Why? Classifying massive amounts of data is too manually intensive and time-consuming for most shops. These limitations aren't going away, but other technologies like CDP may persuade weary IT managers to make the effort. Data classification and tracking will uncover the most exposed critical information and where it lives. Once this is accomplished, CDP benefits can be instantaneous.

  • HIGHLY AVAILABLE INFRASTRUCTURE. CDP will copy changing data frequently, but what if servers, storage or networks are unavailable? As a follow-up to data classification, IT managers must also assess infrastructure assets to uncover any single points of failure. Is critical data supported by multipathing, RAID arrays and server clusters? Are remote data centers available over multiple network routes? These kinds of infrastructure assessments may seem obvious, but ESG finds that sometimes the most apparent items are also those most likely to be ignored.

  • APPLICATION-BASED RECOVERY PROCESSES. Storage risk-assessment specialists often point to a common problem with disaster recovery. While users tend to test their disaster recovery processes at the system level, they fail to test across a multitiered application environment. This can introduce problems when it comes to the timing and complex interrelationships inherent in restoring business applications and databases. Smart storage managers will use CDP as a blank sheet of paper to improve disaster recovery readiness. This can be accomplished by starting with the data and then working up to mapping application topologies. Finally, application-recovery processes should be fully tested to ensure that planning assumptions are correct.

  • ILM SECURITY. While CDP and storage tiering are clearly beneficial, they introduce some new risks because online storage is easier to break into than tapes collecting dust in an offsite vault. Make sure to support CDP and storage tiering with secure configurations, role-based administration, appropriate access controls, and reporting and auditing.

Bottom line
CDP is by no means a panacea. To reach its full potential, it must be supported by the right process, organizational and technological changes. That said, the potential operational and capital benefits from a combination of CDP, ILM and disk-to-disk backup are impressive. Savvy storage managers will bring their vendors in to discuss their roadmaps for CDP and to work out an implementation strategy. Be sure to cast a wide net, because storage bigwigs and startups alike offer some impressive options for CDP.

This was first published in March 2006

