The major backup software vendors are adding new ways to protect and manage data.
Data protection is changing rapidly: point-in-time recoveries, faster legal discovery response times and near real-time disaster recoveries are becoming new requirements for corporate data protection. To address these and other needs, enterprise backup applications are adding support for continuous data protection (CDP), deduplication, e-discovery, single-instance storage (SIS) and VMware Inc.'s VMware Consolidated Backup (VCB) framework to become the focal point for not just integrated data protection but enterprise data management.
An integrated, comprehensive data protection suite provides the following benefits over standalone products:
- Aligns data protection with app requirements
- Centralizes policy creation and management
- Leverages CDP and host- or array-based replication for faster recovery options
- Reduces administration and licensing costs
How these new features integrate with backup software and how they're licensed will influence the degree to which companies can centrally manage them. Vendors are integrating these features in two ways: through the use of scripts so that backup software acts as a dashboard to report and manage these new features, and by re-architecting back-end backup software catalogs and data stores to handle the additional meta data and data these new features create (see "Data protection checklist," below).
Data protection checklist
Corporate compliance, ease of administration, lower costs, and heightened recovery time objectives (RTOs) and recovery point objectives (RPOs) are the new benchmarks by which companies should measure their data protection software. Here's a checklist of the key features companies should evaluate as vendors integrate their data protection software.
Centralized management console. Enterprise administrators must increasingly manage data that spans business units and geographic regions. Data management software must integrate with corporate security infrastructures to secure access to data, while giving administrators the appropriate permissions to protect data anywhere in the company.
Data reduction. Companies are keeping more data on disk and for longer periods of time. Data reduction reduces both cost and environmental concerns.
Information access, index and search. New laws are pushing IT and legal departments to work more closely together. Search engines must support a variety of legal search requirements.
Natively compiled 64-bit code. To manage and process the growing amount of data, the underlying backup software engine needs to work faster. Buying newer, faster 64-bit server hardware helps, but unless the backup software is natively compiled 64-bit code, it can't take full advantage of 64-bit hardware features.
Integrated support for the VMware Consolidated Backup (VCB) framework. VMware is becoming a ubiquitous technology in enterprise data centers, and backup software for the VCB framework is now a prerequisite. Backup software should do more than just back up VMDK files; it should understand them so it can provide granular restores at the virtual machine or file level.
More meta data
CommVault's Simpana 7.0 uses the same software catalog across its entire line of data protection products; users have immediate access to its Common Technology Engine (CTE) catalog regardless of which CommVault product is selected. When used in conjunction with its Galaxy Backup & Recovery software, the CommServe master server centralizes, coordinates and distributes policies across one or more of its media servers.
The CommServe database contains top-level information for the complete environment, while a secondary catalog tier is federated across the media server indexing structure. Using a distributed index, each media server catalogs and manages the data under its control and submits only a subset of that information to the master CommServe database catalog. This distributed database design prevents that catalog from becoming too large and slowing its performance. EMC Corp. NetWorker is similar in that it uses its own proprietary catalog, which can hold millions of objects and creates a federation of smaller catalogs divided into data zones that maintain information about the media servers under their control.
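The two-tier catalog described above can be illustrated with a minimal Python sketch. It is not CommVault's or EMC's actual schema: the class names, fields and the choice of what goes into the summary are all assumptions made for illustration. Each media server keeps its full object index locally and submits only a small summary to the master catalog, which is then used to route a lookup to the right media server.

```python
from dataclasses import dataclass, field


@dataclass
class MediaServerIndex:
    """Detailed per-object index held locally by one media server (hypothetical)."""
    name: str
    objects: dict = field(default_factory=dict)  # path -> list of backup job IDs

    def record(self, path, job_id):
        self.objects.setdefault(path, []).append(job_id)

    def summary(self):
        # Only a subset of the local index is sent upstream: job IDs and a
        # count, not the per-object entries themselves.
        jobs = {job for ids in self.objects.values() for job in ids}
        return {
            "server": self.name,
            "object_count": len(self.objects),
            "jobs": sorted(jobs),
        }


class MasterCatalog:
    """Top-level catalog that stores summaries only (hypothetical)."""

    def __init__(self):
        self.summaries = {}
        self.job_owner = {}

    def submit(self, summary):
        self.summaries[summary["server"]] = summary
        for job in summary["jobs"]:
            self.job_owner[job] = summary["server"]

    def locate(self, job_id):
        # Returns the media server whose local index holds the details.
        return self.job_owner.get(job_id)
```

Because the master stores one small record per media server rather than one per backed-up object, its size grows with the number of servers and jobs instead of the number of files.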
Symantec Corp. took steps to upgrade its Veritas NetBackup backup software catalog in 2005 because of the increased amount of meta data it needed to manage. Symantec changed Veritas NetBackup's underlying catalog from a flat-file to a Sybase relational database with its Veritas NetBackup 6.0 release; a series of maintenance packs released throughout 2006 addressed some of the issues created by this change in the underlying catalog database.
Veritas NetBackup catalog issues still exist depending on which version of NetBackup a company uses. On Windows Master Servers running Veritas NetBackup 6.0 MP5, if catalog compression is turned on and the image database becomes full, Veritas NetBackup may not be able to continue normal database activity and its services may stop. Veritas NetBackup 6.5.1 includes fixes for the catalog issues introduced by Veritas NetBackup 6.5, such as processing large numbers of images more efficiently and performing incremental hot catalog backups.
IBM Corp. is planning what it calls a "significant" database upgrade in a future release of its Tivoli Storage Manager (TSM) software (no release date has been provided). As part of the release, IBM plans to support more concurrent archive and backup operations, raise its maximum database catalog size to 1TB (vs. the current 530GB limit), and allow online, automated database reorganizations (currently, database reorganizations can only be conducted offline).
CA didn't change its catalog database in its February 2008 ARCserve Backup r12 release. However, to enhance CA ARCserve's search capabilities, its complementary CA XOsoft CDP software will now use Microsoft SQL Server as its back-end catalog database so ARCserve can take advantage of SQL Server's built-in content indexing functions.
Vendors are adding new data-reduction technologies to their core backup products. EMC introduced the Avamar deduplication feature in its NetWorker 7.4 backup clients, while Symantec created a new standard tier for its client licenses so admins can install Veritas NetBackup 6.5 or the Veritas NetBackup PureDisk 6.2 agent on a client server using the same license. EMC's Avamar and Symantec's Veritas NetBackup PureDisk agents are similar in that they both do data deduplication at the block level. The main difference for now is that EMC can centrally manage its Avamar clients using its NetWorker 7.4 management console, while Veritas NetBackup users must use a separate management interface to manage Veritas NetBackup and Veritas NetBackup PureDisk clients.
Longer term, Symantec's Veritas NetBackup appears better positioned to extend deduplication to more enterprise servers. In early 2008, Symantec expects some of its hardware partners such as Data Domain Inc., Network Appliance (NetApp) Inc. and others to offer plug-ins for the NetBackup Open Storage API, giving users more control over the data residing on these devices. Using the Open Storage API, NetBackup can exchange information with these storage devices and record activities such as the expiration of remote data.
CommVault's Simpana 7.0 uses SIS for its backup and archive data stores. Brian Brockway, CommVault's senior director of product management, says using SIS provides "about 80% of the data-reduction benefits of deduplication, but 110% of the benefits." Because files are stored in their original format in the data store, they don't need to be reconstructed from chunks of data during recoveries; data stores may also be used for other purposes such as legal discoveries.
Brockway concedes that for companies that have large databases or large numbers of files with only small differences between each file, deduplication appliances can still provide significant benefits. "Deduplicating data on back-end backup appliances still resonates with customers, and to that purpose we designed our SIS stores to align with any block-level deduplication storage device," he says.
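The tradeoff Brockway describes, file-level SIS vs. block-level deduplication when files differ only slightly, can be shown with a small Python sketch. It assumes fixed 4KB blocks and SHA-256 fingerprints purely for illustration. Neither function reflects how Simpana, Avamar or PureDisk is actually implemented, and the function names are hypothetical.

```python
import hashlib


def sis_stored_bytes(files):
    """File-level single-instance storage: one copy per identical file."""
    unique = {hashlib.sha256(data).hexdigest(): len(data) for data in files}
    return sum(unique.values())


def block_dedup_stored_bytes(files, block_size=4096):
    """Fixed-size block-level deduplication: one copy per identical block."""
    unique = {}
    for data in files:
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            unique[hashlib.sha256(block).hexdigest()] = len(block)
    return sum(unique.values())


# Four distinct 4KB blocks form a 16KB file.
base = b"".join(bytes([i]) * 4096 for i in range(4))
# A second version differs from the first only in its final byte.
edited = base[:-1] + b"\xff"
files = [base, base, edited]

sis_total = sis_stored_bytes(files)            # two unique files are kept
block_total = block_dedup_stored_bytes(files)  # five unique blocks are kept
```

With these inputs SIS stores both 16KB files in full, while block-level deduplication stores only the four original blocks plus the one changed block. The SIS copies, however, remain whole files that can be read directly without reassembly.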
CA elected to pursue relationships with existing deduplication backup appliance vendors for its ARCserve r12 backup software. CA is exposing its backup software format to deduplication backup appliance providers so they can take steps to read the data inside CA ARCserve backup data streams. ExaGrid Systems Inc.'s deduplicating backup appliances use CA ARCserve's backup data format to understand the layout of the ARCserve backup job, read its roadmaps and signatures, and identify the type of data and size of data segments to deduplicate the data.
CDP is also emerging as a must-have feature in an integrated data protection suite. CommVault is the furthest along in its integration of Continuous Data Replicator (CDR) and Galaxy Backup & Recovery software, as these two products share CommVault's CTE for storing meta data. However, CDR performs only near-CDP as it takes application-consistent snapshots of the data at regular, frequent intervals as opposed to journaling all of the changes to the data.
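The difference between near-CDP and journal-based CDP comes down to which recovery points exist. The following Python sketch contrasts the two under simplifying assumptions: data is modeled as a key/value dictionary, timestamps are plain integers, and snapshots and writes are recorded in time order. The class names are hypothetical and do not correspond to any vendor's product.

```python
import bisect


class SnapshotProtector:
    """Near-CDP: recovery is limited to the latest snapshot at or before a time."""

    def __init__(self):
        self.times = []   # snapshot timestamps, appended in increasing order
        self.states = []  # copy of the data at each snapshot

    def snapshot(self, t, state):
        self.times.append(t)
        self.states.append(dict(state))

    def recover(self, t):
        i = bisect.bisect_right(self.times, t)
        return dict(self.states[i - 1]) if i else {}


class JournalProtector:
    """Journal-based CDP: every write is logged, so any point in time can be rebuilt."""

    def __init__(self):
        self.journal = []  # (time, key, value), appended in time order

    def write(self, t, key, value):
        self.journal.append((t, key, value))

    def recover(self, t):
        state = {}
        for when, key, value in self.journal:
            if when > t:
                break
            state[key] = value
        return state
```

If a write lands between two snapshots, the snapshot approach can only return the data as of the earlier snapshot, while replaying the journal reproduces the data as of the requested moment.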
EMC NetWorker 7.4 and Symantec Veritas NetBackup 6.5 now include updates to their backup agents to support their respective RecoverPoint and Continuous Data Protection/Replication (CDP/R) Fibre Channel (FC) SAN-based CDP software. For instance, Symantec's CDP/R host agent uses the Veritas NetBackup 6.5 client's ability to quiesce databases so it can insert a marker into the CDP/R journal during these pauses to create consistent recovery points. EMC NetWorker offers a PowerSnap module for RecoverPoint that tracks all of the CDP snapshots in its catalog so users have a centralized view of their recovery points.
Encrypting data sent offsite also remains at the top of users' priority lists. While all backup software can encrypt data at the client level, it's now possible to offload encryption from the client and manage it elsewhere in the backup process.
Veritas NetBackup 6.5's new Media Server Encryption Option (MSEO) can encrypt data on the media server. Data is first backed up to disk and then encrypted by the media server when it's moved from disk to tape. This technique offloads the performance hit typically associated with encryption from the client to the media server; copying from disk to tape normally occurs during off-hours when the media server is idle.
EMC has no short-term plans to introduce media server encryption in NetWorker. Rob Emsley, senior director of software product marketing, says customers prefer to put encryption outside of the domain of backup software, and that encryption is better suited to be handled by a dedicated appliance or FC switch.
CA ARCserve added support for offloading encryption to its media servers in its r12 release. CA ARCserve passes user-generated keys or keys supplied by an external key manager to the LTO-4 tape drive, which it uses to encrypt the data. Although ARCserve currently supports only LTO-4 tape drives, CA says it's technically possible to extend this support to other tape devices.
Media server load balancing
The need for backup software to do media server load balancing and failover is driven in part by the changing nature of backup loads and the need to continue doing backups when a media server is overloaded or fails. Veritas NetBackup 6.5's new Media Server Load Balancing feature creates a virtual pool of media servers. When a backup job starts, it's assigned to the least busy media server based on its CPU, memory, IO throughput and the number of active jobs. If a media server fails, new backup jobs are sent only to active media servers, while in-progress backup jobs are marked and automatically continue so another available media server doesn't need to restart the entire job.
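The selection step described above, assigning a new job to the least busy active media server, can be sketched in a few lines of Python. This is an illustration only: the equal weighting of CPU, memory, I/O and job count, the `max_jobs` normalization and all names are assumptions, not Veritas NetBackup's actual algorithm.

```python
from dataclasses import dataclass


@dataclass
class MediaServer:
    name: str
    cpu: float          # utilization, 0.0 to 1.0
    memory: float       # utilization, 0.0 to 1.0
    io: float           # I/O throughput utilization, 0.0 to 1.0
    active_jobs: int
    max_jobs: int = 10  # assumed job capacity, used to normalize active_jobs
    online: bool = True

    def load(self):
        # Equal weighting of the four metrics is an assumption for this sketch.
        return (self.cpu + self.memory + self.io
                + self.active_jobs / self.max_jobs) / 4


def pick_server(pool):
    """Return the least-loaded server, skipping any that have failed."""
    candidates = [s for s in pool if s.online]
    if not candidates:
        raise RuntimeError("no active media servers available")
    return min(candidates, key=lambda s: s.load())
```

Marking a server as offline removes it from consideration, which mirrors the failover behavior for new jobs; resuming in-progress jobs from a checkpoint is outside the scope of this sketch.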
Integrated support for the VMware VCB framework is also emerging as an important feature for integrated data protection. Although backup software agents can run on individual virtual machines (VMs), running multiple backup jobs at the same time on the same VMware server consumes shared server CPU, memory and network resources, and creates system bottlenecks. The VCB framework supports a full backup of all of the data on the VMware server by backing up its Virtual Machine Disk Format (VMDK) file. However, VMDK file recoveries are an all-or-nothing proposition as they only restore the entire VMware server and all of its VMs to a specific point in time (see "How backup vendors cope with virtual servers," below).
How backup vendors cope with virtual servers
Backup software vendors are taking advantage of the new file-level recovery feature supported by the VMware Consolidated Backup (VCB) framework. However, vendors are adding some specific options unique to their backup software to minimize the overhead associated with backing up VMware ESX servers.
EMC Corp. Avamar takes advantage of the VCB feature that allows a proxy server to mount a virtual machine (VM) on another VMware Infrastructure server running on a SAN. It provides the same deduplication benefits as when it runs on a local VM, while offloading the CPU utilization and backup times to the proxy backup server.
CommVault Galaxy Backup & Recovery provides two different recovery options using VCB: a VCB-proxy, full-image VM backup for disaster recovery that only works on SAN-based VMware ESX servers; and a VCB-proxy, file-level backup that does file-level restores for Windows-based clients.
IBM Corp. Tivoli Storage Manager 5.5 uses the VCB proxy server feature to coordinate the movement of VM data from the VCB proxy server to tape devices.
Symantec Corp. Veritas NetBackup for VMware, part of NetBackup 6.5's new Snapshot Client, enhances the granularity of recoveries. NetBackup 6.5 only does a single backup pass of the VMDK file, but can restore individual files or the full image (VMDK). Other backup software products typically require two passes to do a backup so they can do either file- or image-level restores.
Vendors are changing, mostly for the better, how they license their backup software features. For example, Symantec recently changed its licensing from a capacity-based model to a three-tier licensing model with its release of Veritas NetBackup 6.5. Under this licensing option, firms have access to all of Veritas NetBackup's data protection features for all of their servers based on what tier they subscribe to. Currently, Symantec offers the following tiering options:
- Standard Infrastructure tier. Provides support for physical tape, including library-based tape drives, the shared storage option for FC-SAN tape drives, Vault and the StorageTek Virtualization Option.
- Enterprise Infrastructure tier. Includes everything in the Standard Infrastructure tier plus disk-based backup, NDMP support, Open Storage Option, Advanced Disk Option and the VTL Option.
- Premium Infrastructure for PureDisk tier. Includes everything in the Enterprise Infrastructure tier plus support for NetBackup PureDisk deduplication.
However, users shouldn't assume that Symantec's or any vendor's licensing model is etched in stone, as vendors are still debating internally what features belong where and even which licensing model they should use. Symantec is on the fence as to whether to include its CDP option in its Enterprise or Premium tier. EMC is moving licensing for its Avamar and NetWorker product lines toward a capacity-based model and not charging customers for NetWorker's integration with Avamar; however, if users find capacity-based pricing unsuitable, EMC is willing to talk.
"We have worked some specific customer deals on a one-off basis," says EMC's Emsley.
CA also simplified its licensing model, reducing its licensing options to eight, although it's still following the more traditional server-based licensing model for now. CA licenses its ARCserve and XOsoft product lines based on the total number of application, database, file and messaging servers they'll protect; a server requires separate ARCserve and XOsoft licenses if a user wishes to run both products on it.
Archiving and legal discovery
Backup software vendors are at different stages in integrating their data protection and archiving with search functions. IBM's System Storage Archive Manager (SSAM) is ahead of the pack and an indicator of where most enterprise integrated data protection software will eventually end up. SSAM is a modified version of TSM that makes the deletion of data extremely difficult because it won't permit data deletion on storage managed by the SSAM server before the data's scheduled expiration date.
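The retention behavior described for SSAM, refusing deletion until the scheduled expiration date, can be expressed as a simple guard. The Python sketch below is a conceptual illustration under assumed names (`RetentionStore`, `RetentionError`); it does not use or represent IBM's actual interfaces.

```python
from datetime import date


class RetentionError(Exception):
    """Raised when an operation would violate a retention period."""


class RetentionStore:
    """Archive store that blocks deletion before an object's expiration date."""

    def __init__(self):
        self._objects = {}  # name -> (data, expires_on)

    def archive(self, name, data, expires_on):
        # Overwriting is refused too, since it would amount to a deletion.
        if name in self._objects:
            raise RetentionError(f"{name} is already archived")
        self._objects[name] = (data, expires_on)

    def delete(self, name, today):
        _, expires_on = self._objects[name]
        if today < expires_on:
            raise RetentionError(f"{name} is retained until {expires_on}")
        del self._objects[name]

    def __contains__(self, name):
        return name in self._objects
```

Enforcing the check inside the store, rather than in each client, is what makes early deletion difficult regardless of which application or administrator issues the request.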
CommVault adapted the architecture of its backup data store to allow most third-party search engines to access its repository. CommVault currently utilizes the FAST search engine to search the meta data stored in its CommServe database catalog, but has left the door open in its database catalog architecture so it can work with other third-party search engines should corporate search requirements change.
Backup software is evolving to where it can coordinate and manage other data protection features. EMC NetWorker and Symantec Veritas NetBackup currently have the broadest portfolio of data protection options, and recent changes to their data protection licensing should make their software more palatable. But the complexity associated with implementing and managing Veritas NetBackup's and NetWorker's multiple features makes simpler data protection suites like CA ARCserve and CommVault Simpana more appealing to small- and medium-sized enterprises.