Published: 26 Sep 2008
As continuous data protection (CDP) is integrated into backup environments, implementations with other backup protection programs may become complex.
Continuous data protection (CDP) software is primed to tackle enterprise high-availability requirements. As companies seek to eliminate or drastically reduce backup windows and to create faster point-in-time recoveries for more mission-critical apps, CDP software has become a viable alternative to traditional backup software and storage system-based replication software. But with multiple points where CDP software can be integrated in the storage infrastructure, and the emerging linkage of CDP with backup applications, organizations may need to tackle lengthy and potentially complex CDP implementations. This article focuses mainly on CDP products from vendors who sell CDP and backup software.
As CDP becomes a viable option for enterprise servers, organizations need to decide how extensive their deployment will be as it may require the use of one, two or even three CDP products to protect all desktops and servers. The makeup of the CDP host agent is a tip-off to the scope of data protection the CDP software can provide. File-system-based CDP host agents take advantage of existing TCP/IP networks and can recover data at the file-system or volume level, but may cause excessive server and network overhead in high-transaction environments.
CDP products with block-based agents minimize overhead by capturing changes at the volume level and transmitting all changes over IP or FC SANs. Block-based CDP agents can scale to meet the performance requirements of most mission-critical applications, but can introduce significant cost and complexity.
Selecting the right CDP product architecture for enterprise servers often comes down to whether the server is connected to an IP network or an FC SAN. If connected to an FC SAN, you need to determine if there are sufficient changes to app data to justify the deployment of an FC SAN-attached CDP appliance.
CDP products for network-attached servers come in many different architectures. CA XOsoft, for example, installs at the file-system level and is configurable as a standalone product and as a replication target on a secondary server.
VMware users may want to consider FalconStor's CDP Virtual Appliance for VMware. Although both CA XOsoft and FalconStor Software support VMware and the installation of host CDP agents on guest OSes, FalconStor installs its host CDP agent at the block level. The agent then stores the changed data to a volume presented over the corporate IP network by the FalconStor CDP Appliance. Once the CDP software creates a copy of the data on that virtual volume, the administrator may break off that virtual CDP LUN from the source and present it to any other operating system on that ESX server for recoveries or testing and development (see "CDP and VMware," below).
FC CDP appliances
Stefan Knoerer, senior software and system architect at Siemens IT Solutions and Services in Munich, Germany, needed faster restores of a high-performance database. The Oracle database, consisting of more than 200 LUNs, resided on an EMC Symmetrix DMX-3. Each day, there were approximately 350GB worth of changes to the database, which ran at around 5,000 IOPS and used eight FC HBAs on each HP-UX server. Siemens was using EMC DMX-3's Business Continuance Volumes along with backup software for data protection, but it could still take up to 20 hours to restore a database when corruptions occurred.
To address this problem, Knoerer evaluated FC SAN-based CDP appliances, selecting EMC's RecoverPoint CDP. After a proof of concept followed by a three-month trial, Knoerer implemented the CDP appliance in early 2007 and witnessed a big reduction in database recovery times. "RecoverPoint reduced our database recovery times to two to four hours, depending on database traffic," says Knoerer.
Although Siemens used EMC's PowerPath to split writes at the host level, EMC's RecoverPoint also takes advantage of new protocols in FC switches and FC director line cards that can split host writes at the fabric level. The EMC RecoverPoint CDP appliance logs on to a Brocade AP-7420B Multiprotocol Router FC switch or a Cisco Systems Inc. MDS 9000 family FC director with a Storage Services Module line card. EMC RecoverPoint requests that the service begin copying writes for a specific host's data stream, which is then sent to the RecoverPoint appliance over the FC connection.
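The idea of splitting writes can be illustrated with a small conceptual sketch. It is not EMC's, Brocade's or Cisco's implementation; the `Volume` and `WriteSplitter` classes and the in-memory queue standing in for the appliance are all hypothetical names chosen for illustration. Each host write goes to the production volume as usual, and a sequenced copy is passed along for the CDP appliance.

```python
class Volume:
    """Toy block device (hypothetical): maps block number -> bytes."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def write(self, block_no, data):
        self.blocks[block_no] = data


class WriteSplitter:
    """Hypothetical splitter: applies each host write to production and
    queues a sequenced copy of the same write for the CDP appliance."""

    def __init__(self, production, appliance_queue):
        self.production = production
        self.appliance_queue = appliance_queue
        self.sequence = 0

    def write(self, block_no, data):
        self.sequence += 1
        # The production write happens exactly as it would without CDP.
        self.production.write(block_no, data)
        # The copy keeps its sequence number so write order is preserved.
        self.appliance_queue.append(
            (self.sequence, self.production.name, block_no, data)
        )


prod = Volume("lun0")
queue = []  # stands in for the link to the CDP appliance
splitter = WriteSplitter(prod, queue)
splitter.write(7, b"alpha")
splitter.write(7, b"beta")
```

In this sketch the production volume holds only the latest contents of block 7, while the queue retains both writes in order. That retained history is what allows recovery to an earlier point in time.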
Both of Symantec's backup programs--Backup Exec and Veritas NetBackup--offer CDP technologies that help users to minimize potential data loss by creating more frequent recovery points. Veritas NetBackup RealTime Protection (available in fall 2008) is a CDP offering designed for mission-critical data center applications with heavy workloads. It supports both Unix and Windows operating environments, and requires SAN-based storage.
Veritas NetBackup RealTime requires the latest version of NetBackup, Version 6.5.2, to unlock its full functionality. It's part of the Veritas NetBackup product line and uses NetBackup for policy creation and recovery processes. RealTime can also be deployed as a standalone product and is sold separately.
Backup Exec also offers continuous data protection technology (Backup Exec Continuous Protection Server) designed for Windows environments with moderate to light workloads, and has no specific storage requirements. Backup Exec's CDP capabilities are currently available.
A crash-consistent image is the default-recoverable database image that all block-based CDP products provide. Should a database corruption occur, storage administrators may recover the database to any past point-in-time. However, full database recoveries still depend on the database admin to replay the database transaction logs to complete the recovery to a state where the restored image is usable by the database.
Transaction-consistent database recoveries minimize the database admin's involvement in recoveries. To create points in the CDP journal that are identifiable as recoverable transaction-consistent database images, CDP products offer application-specific host agents for databases. The CDP host agent monitors the database for periods when it enters a transaction-consistent state and then inserts a bookmark into the CDP journal. When performing recoveries, the CDP software identifies and displays these bookmarks so admins can restore images that are immediately accessible.
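The difference between a crash-consistent image and a bookmarked, transaction-consistent image can be sketched with a toy journal. This is a simplified model and not any vendor's journal format; the `Journal` class, its method names and the integer timestamps are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Journal:
    """Toy CDP journal (hypothetical): time-ordered writes plus bookmarks."""

    entries: list = field(default_factory=list)    # (timestamp, block_no, data)
    bookmarks: list = field(default_factory=list)  # (timestamp, label)

    def record_write(self, ts, block_no, data):
        # Writes are assumed to arrive in timestamp order.
        self.entries.append((ts, block_no, data))

    def add_bookmark(self, ts, label):
        # Called by the host agent when the database reports a consistent state.
        self.bookmarks.append((ts, label))

    def image_at(self, ts):
        """Rebuild the volume image by replaying writes up to and including ts."""
        image = {}
        for entry_ts, block_no, data in self.entries:
            if entry_ts > ts:
                break
            image[block_no] = data
        return image

    def latest_bookmark_before(self, ts):
        """Return the most recent bookmark at or before ts, or None."""
        candidates = [b for b in self.bookmarks if b[0] <= ts]
        return max(candidates, key=lambda b: b[0]) if candidates else None


journal = Journal()
journal.record_write(1, 0, b"begin txn")
journal.record_write(2, 1, b"commit txn")
journal.add_bookmark(2, "quiesced")
journal.record_write(3, 0, b"partial write")

crash_image = journal.image_at(3)             # any point in time
bookmark = journal.latest_bookmark_before(3)  # last known-good point
consistent_image = journal.image_at(bookmark[0])
```

Recovering to time 3 yields an image that includes the in-flight write and would still need log replay by the database admin. Recovering to the bookmark at time 2 yields the image captured when the database was in a transaction-consistent state.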
To create a consistency group, an admin selects and aggregates the virtual CDP LUNs that mirror the LUNs on which the production database resides. Each consistency group has its own journal that tracks data changes to any of the LUNs belonging to that consistency group. A critical factor is placing the CDP consistency group journal on back-end disk that matches or exceeds the performance of production LUNs.
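A consistency group can be pictured as several LUNs sharing one ordered journal, so that all of them can be rolled back to the same point. The sketch below is a conceptual model only; the `ConsistencyGroup` class, the LUN names and the sequence-number scheme are hypothetical and do not describe how any particular CDP product stores its journal.

```python
class ConsistencyGroup:
    """Toy consistency group (hypothetical): one journal shared by many LUNs."""

    def __init__(self, lun_names):
        self.lun_names = set(lun_names)
        self.journal = []  # (sequence, lun, block_no, data)
        self.seq = 0

    def record_write(self, lun, block_no, data):
        if lun not in self.lun_names:
            raise ValueError(f"{lun} is not in this consistency group")
        # A single counter orders writes across every LUN in the group.
        self.seq += 1
        self.journal.append((self.seq, lun, block_no, data))
        return self.seq

    def images_at(self, seq):
        """Rebuild every LUN in the group as of the same sequence point."""
        images = {lun: {} for lun in self.lun_names}
        for entry_seq, lun, block_no, data in self.journal:
            if entry_seq > seq:
                break
            images[lun][block_no] = data
        return images


group = ConsistencyGroup(["data_lun", "log_lun"])
group.record_write("log_lun", 0, b"log: update row 1")
point = group.record_write("data_lun", 5, b"row 1 v2")
group.record_write("log_lun", 1, b"log: update row 2")

images = group.images_at(point)
```

Because both LUNs are rebuilt from the same journal position, the data LUN never ends up ahead of or behind the log LUN. Every write to any member LUN also lands in the shared journal, which is one way to see why the journal's back-end disk needs to keep pace with production.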
Critical to Siemens' implementation was the creation and configuration of three CDP consistency groups to keep the database consistent across more than 200 RecoverPoint CDP LUNs. Siemens' Knoerer matched his 200 production database LUNs with virtual RecoverPoint LUNs residing on EMC Clariion CX700 disk.
However, Knoerer placed his consistency group journal on virtual CDP LUNs that were mapped back to the DMX-3 because he needed the higher performance of the DMX-3 storage to keep pace with the large number of changes in the Oracle database.
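A rough back-of-the-envelope calculation shows what a change rate like the 350GB per day cited for the Siemens database implies for a journal. The sketch assumes decimal units (1GB = 1,000MB), an evenly spread write load, and a hypothetical 48-hour retention window; it ignores journal metadata overhead and write bursts, so real sizing would need vendor guidance and measured peaks.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_write_rate_mb_s(daily_change_gb):
    """Average change rate in MB/s, assuming writes are spread evenly."""
    return daily_change_gb * 1000 / SECONDS_PER_DAY


def journal_capacity_gb(daily_change_gb, retention_hours):
    """Raw journal space needed to retain every change for the window."""
    return daily_change_gb * retention_hours / 24


avg_rate = average_write_rate_mb_s(350)       # roughly 4.05 MB/s
capacity_48h = journal_capacity_gb(350, 48)   # 700.0 GB for a 48-hour window
```

The average throughput is modest, which suggests that capacity and sustained throughput are only part of the picture. The journal also has to absorb the write portion of the production I/O load and any bursts, which is consistent with placing it on disk that performs at least as well as the production LUNs.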
The next frontier that CDP software needs to address, perhaps beginning in FC SAN-attached environments, is how to keep apps that run across multiple servers consistent. Eric Burgener, senior analyst and consultant at Hopkinton, MA-based Taneja Group, says there's a debate going on in the CDP community about the best way to create a consistent image across multiple servers.
FC SAN-attached CDP products will involve lengthier testing and configuration periods on a per-server basis. Admins will need to allocate new FC ports on FC SAN directors for the CDP appliance and new storage capacity to mirror the source server's production volumes; they'll also need to deploy new storage that matches the performance of the production database to keep the CDP journal.
In the end, reduced recovery time and backup software integration will be the deciding factors in product selection. FC SAN-attached CDP products offer almost immediate recoveries and allow applications to fail over and operate on virtual volumes presented by the CDP appliance with minimal or no application performance degradation. If simplified backup and recovery is your primary objective, backup products with integrated CDP promise to dramatically lower recovery time objectives and recovery point objectives.