The major backup software vendors are adding new ways to protect and manage data.
An integrated, comprehensive data protection suite provides the following benefits over standalone products:
More meta data
The CommServe database contains top-level information for the complete environment, while a secondary catalog tier is federated across the media server indexing structure. Using a distributed index, each media server catalogs and manages the data under its control and submits only a subset of that information to the master CommServe database catalog. This distributed database design keeps the master catalog from growing so large that its performance suffers. EMC Corp.'s NetWorker is similar in that it uses its own proprietary catalog, which can hold millions of objects and creates a federation of smaller catalogs divided into data zones that maintain information about the media servers under their control.
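The two-tier catalog idea can be illustrated with a minimal Python sketch. It is not CommVault's or EMC's implementation; the class names, fields and the choice of "file count" as the summarized subset are all assumptions made for illustration. Each media server keeps the detailed file index, and the master catalog stores only a small summary plus enough information to route a detailed lookup to the right media server.

```python
from dataclasses import dataclass, field


@dataclass
class MediaServerCatalog:
    """Detailed per-file index kept locally on one media server (hypothetical model)."""
    name: str
    entries: dict = field(default_factory=dict)  # job_id -> list of file paths

    def record_backup(self, job_id, paths):
        self.entries[job_id] = list(paths)

    def summary(self, job_id):
        # Only a small subset of the meta data is sent to the master catalog.
        return {
            "job_id": job_id,
            "server": self.name,
            "file_count": len(self.entries[job_id]),
        }

    def find(self, job_id, suffix):
        return [p for p in self.entries.get(job_id, []) if p.endswith(suffix)]


class MasterCatalog:
    """Top-level catalog: knows which media server owns each job, not every file."""

    def __init__(self):
        self.jobs = {}     # job_id -> summary dict
        self.servers = {}  # server name -> MediaServerCatalog

    def register(self, server):
        self.servers[server.name] = server

    def submit(self, summary):
        self.jobs[summary["job_id"]] = summary

    def locate(self, job_id, suffix):
        # Route the detailed lookup to the media server that owns the job.
        owner = self.servers[self.jobs[job_id]["server"]]
        return owner.find(job_id, suffix)


master = MasterCatalog()
ms1 = MediaServerCatalog("ms1")
master.register(ms1)
ms1.record_backup("job-1", ["/etc/hosts", "/var/db/a.mdf", "/var/db/b.mdf"])
master.submit(ms1.summary("job-1"))
```

In this sketch the master catalog grows by one small record per job regardless of how many files the job contains, which is the property the distributed design is meant to provide.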
Symantec Corp. took steps to upgrade its Veritas NetBackup backup software catalog in 2005 because of the increased amount of meta data it needed to manage. Symantec changed Veritas NetBackup's underlying catalog from a flat-file to a Sybase relational database with its Veritas NetBackup 6.0 release; a series of maintenance packs released throughout 2006 addressed some of the issues created by this change in the underlying catalog database.
Veritas NetBackup catalog issues still exist depending on which version of NetBackup a company uses. On Windows Master Servers running Veritas NetBackup 6.0 MP5, if catalog compression is turned on and the image database becomes full, Veritas NetBackup may not be able to continue normal database activity and its services may stop. Veritas NetBackup 6.5.1 includes fixes for the catalog issues introduced by Veritas NetBackup 6.5, such as processing large numbers of images more efficiently and performing incremental hot catalog backups.
IBM Corp. is planning what it calls a "significant" database upgrade in a future release of Tivoli Storage Manager (TSM); no release date has been provided. As part of the release, IBM plans to support more concurrent archive and backup operations, raise its maximum database catalog size to 1TB (vs. the current 530GB limitation), and allow online, automated database reorganizations (currently, database reorganizations can only be conducted offline).
CA didn't change its catalog database in its February 2008 ARCserve Backup r12 release. However, to enhance CA ARCserve's search capabilities, its complementary CA XOsoft CDP software will now use Microsoft SQL Server as its back-end catalog database so ARCserve can take advantage of SQL Server's built-in content indexing functions.
Longer term, Symantec's Veritas NetBackup appears better positioned to extend deduplication to more enterprise servers. In early 2008, Symantec expects some of its hardware partners such as Data Domain Inc., Network Appliance (NetApp) Inc. and others to offer plug-ins for the NetBackup Open Storage API, giving users more control over the data residing on these appliances. Using the Open Storage API, NetBackup can exchange information with these storage devices and record activities such as the expiration of remote data.
CommVault's Simpana 7.0 uses single-instance storage (SIS) for its backup and archive data stores. Brian Brockway, CommVault's senior director of product management, says using SIS provides "about 80% of the data-reduction benefits of deduplication, but 110% of the benefits." Because files are stored in their original format in the data store, they don't need to be reconstructed from chunks of data during recoveries; data stores may also be used for other purposes such as legal discovery.
Brockway concedes that for companies that have large databases or large numbers of files with only small differences between each file, deduplication appliances can still provide significant benefits. "Deduplicating data on back-end backup appliances still resonates with customers, and to that purpose we designed our SIS stores to align with any block-level deduplication storage device," he says.
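The tradeoff Brockway describes can be sketched in a few lines of Python. This is an illustration only, not how Simpana or any deduplication appliance works internally: the function names, the tiny 4-byte block size and the sample data are assumptions. File-level single instancing keeps one copy of each distinct file, while block-level deduplication keeps one copy of each distinct block, so it also saves space when files differ only slightly.

```python
import hashlib


def sis_stored_bytes(files):
    """File-level single instancing: keep one copy of each distinct file."""
    unique = {hashlib.sha256(data).hexdigest(): len(data) for data in files}
    return sum(unique.values())


def block_dedup_stored_bytes(files, block_size=4):
    """Block-level deduplication: keep one copy of each distinct fixed-size block.

    The 4-byte block size is unrealistically small and chosen only so the
    sample data below is easy to follow.
    """
    unique = {}
    for data in files:
        for i in range(0, len(data), block_size):
            block = data[i:i + block_size]
            unique[hashlib.sha256(block).hexdigest()] = len(block)
    return sum(unique.values())


files = [
    b"AAAABBBBCCCC",  # original file
    b"AAAABBBBCCCC",  # exact duplicate: both approaches store it once
    b"AAAABBBBDDDD",  # small change: SIS stores the whole file again
]

raw = sum(len(f) for f in files)            # 36 bytes before any reduction
sis = sis_stored_bytes(files)               # 24 bytes: two distinct files
blocks = block_dedup_stored_bytes(files)    # 16 bytes: four distinct blocks
```

With SIS, every stored object is a complete file that can be read back directly; with block-level deduplication, the third file has to be reassembled from shared blocks at recovery time.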
CA elected to pursue relationships with existing deduplication backup appliance vendors for its ARCserve r12 backup software. CA is exposing its backup software format to deduplication backup appliance providers so they can take steps to read the data inside CA ARCserve backup data streams. ExaGrid Systems Inc.'s deduplicating backup appliances use CA ARCserve's backup data format to understand the layout of the ARCserve backup job, read its roadmaps and signatures, and identify the type of data and size of data segments to deduplicate the data.
EMC NetWorker 7.4 and Symantec Veritas NetBackup 6.5 now include updates to their backup agents to support their respective RecoverPoint and Continuous Data Protection/Replication (CDP/R) Fibre Channel (FC) SAN-based CDP software. For instance, Symantec's CDP/R host agent uses the Veritas NetBackup 6.5 client's ability to quiesce databases so it can insert a marker into the CDP/R journal during these pauses to create consistent recovery points. EMC NetWorker offers a PowerSnap module for RecoverPoint that tracks all of the CDP snapshots in its catalog so users have a centralized view of their recovery points.
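The marker technique can be pictured with a small, hypothetical journal model in Python. It does not reflect the actual CDP/R or RecoverPoint journal formats; the class, method names and labels are assumptions. The journal records every write, and the backup client adds a marker whenever the database is quiesced, so a recovery can replay writes up to a known-consistent point.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class JournalEntry:
    seq: int
    kind: str      # "write" or "marker"
    payload: str


class CdpJournal:
    """Toy continuous data protection journal with consistency markers."""

    def __init__(self):
        self.entries: List[JournalEntry] = []

    def _append(self, kind, payload):
        self.entries.append(JournalEntry(len(self.entries), kind, payload))

    def write(self, payload):
        self._append("write", payload)

    def mark_consistent(self, label):
        # Intended to be called while the database is quiesced.
        self._append("marker", label)

    def recovery_points(self):
        # The list a backup catalog could track for a centralized view.
        return [(e.seq, e.payload) for e in self.entries if e.kind == "marker"]

    def replay_to(self, label):
        """Return the writes to apply, in order, to reach the named marker."""
        writes = []
        for e in self.entries:
            if e.kind == "marker" and e.payload == label:
                return writes
            if e.kind == "write":
                writes.append(e.payload)
        raise KeyError(label)


journal = CdpJournal()
journal.write("w1")
journal.write("w2")
journal.mark_consistent("02:00")   # database paused, marker inserted
journal.write("w3")                # later write, not part of the 02:00 point
```

Replaying to the "02:00" marker in this sketch applies only the writes recorded before the database was paused, which is what makes the recovery point consistent.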
Encrypting data sent offsite also remains at the top of users' priority lists. While all backup software can encrypt data at the client level, it's now possible to offload encryption from the client and manage it elsewhere in the backup process.
Veritas NetBackup 6.5's new Media Server Encryption Option (MSEO) can encrypt data on the media server. Data is first backed up to disk and then encrypted by the media server when it's moved from disk to tape. This technique offloads the performance hit typically associated with encryption from the client to the media server; copying from disk to tape normally occurs during off-hours when the media server is idle.
EMC has no short-term plans to introduce media server encryption in NetWorker. Rob Emsley, senior director of software product marketing, says customers prefer to put encryption outside of the domain of backup software, and that encryption is better suited to be handled by a dedicated appliance or FC switch.
CA ARCserve added support for offloading encryption to its media servers in its r12 release. CA ARCserve passes user-generated keys or keys supplied by an external key manager to the LTO-4 tape drive, which uses them to encrypt the data. Although ARCserve currently supports only LTO-4 tape drives, CA says it's technically possible to extend this support to other tape devices.
Media server load balancing
Integrated support for the VMware Consolidated Backup (VCB) framework is also emerging as an important feature for integrated data protection. Although backup software agents can run on individual virtual machines (VMs), running multiple backup jobs at the same time on the same VMware server consumes shared server CPU, memory and network resources, and creates system bottlenecks. The VCB framework supports a full backup of all of the data on the VMware server by backing up its Virtual Machine Disk Format (VMDK) file. However, VMDK file recoveries are an all-or-nothing proposition, as they can only restore the entire VMware server and all of its VMs to a specific point in time (see "How backup vendors cope with virtual servers," below).
"We have worked some specific customer deals on a one-off basis," says EMC's Emsley.
CA also simplified its licensing model, reducing its licensing options to eight, although it's still following the more traditional server-based licensing model for now. CA licenses its ARCserve and XOsoft product lines based on the total number of application, database, file and messaging servers they'll protect; a server requires separate ARCserve and XOsoft licenses if a user wishes to run both products on it.
Archiving and legal discovery
CommVault adapted the architecture of its backup data store to allow most third-party search engines to access its repository. CommVault currently utilizes the FAST search engine to search the meta data stored in its CommServe database catalog, but has left the door open in its database catalog architecture so it can work with other third-party search engines should corporate search requirements change.
Backup software is evolving to the point where it can coordinate and manage other data protection features. EMC NetWorker and Symantec Veritas NetBackup currently have the broadest portfolios of data protection options, and recent changes to their data protection licensing should make their software more palatable. But the complexity associated with implementing and managing Veritas NetBackup's and NetWorker's multiple features makes simpler data protection suites like CA ARCserve and CommVault Simpana more appealing to small- and medium-sized enterprises.