Hardware-based products propelled deduplication into the mainstream, but now that most backup apps include dedupe, you'll have to carefully evaluate the options.
Data growth grabs most of today's IT headlines, and many IT organizations believe data protection is one of the key contributors to the staggering data capacities they must manage. Why? Data protection processes make lots of copies -- at least once per day, but sometimes multiple times daily -- and keep them locally for operational recovery. Copies of copies are also sent offsite for disaster recovery (DR) purposes. Most backup and replication solutions perform these processes inefficiently, making multiple copies of the same file even when only a small amount of the data within the file has changed. Maintaining daily, weekly, monthly and yearly backup copies means that dozens of copies of the same data may be stored, often for extended periods of time. It's this proliferation of data that makes data deduplication a compelling technology for secondary storage environments. While the deduplication spotlight has so far been focused on hardware products that optimize storage capacity, the addition of dedupe capabilities to several backup apps could shift the focus in 2009.
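To make the copy problem concrete, here is a minimal Python sketch of what deduplication buys across a week of daily full backups. It is an illustration only, not any vendor's implementation: the fixed 4 KB chunk size, the synthetic 100-chunk file, and the helper names make_daily_fulls and dedupe_ratio are all assumptions made up for this example.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size; real products vary


def make_daily_fulls(days: int, chunks_per_file: int = 100) -> list[bytes]:
    """Build `days` full copies of one synthetic file.

    After the first day, exactly one chunk changes per day, so each
    full backup is almost identical to the one before it.
    """
    words = CHUNK_SIZE // 4
    chunks = [i.to_bytes(4, "big") * words for i in range(chunks_per_file)]
    backups = []
    for day in range(days):
        if day > 0:
            # Overwrite a single chunk with content never seen before.
            chunks[day] = (1_000_000 + day).to_bytes(4, "big") * words
        backups.append(b"".join(chunks))
    return backups


def dedupe_ratio(backups: list[bytes]) -> float:
    """Logical bytes protected divided by unique chunk bytes stored."""
    store: dict[str, int] = {}  # chunk digest -> chunk length
    logical = 0
    for backup in backups:
        logical += len(backup)
        for offset in range(0, len(backup), CHUNK_SIZE):
            chunk = backup[offset:offset + CHUNK_SIZE]
            # A chunk already in the store costs no additional capacity.
            store.setdefault(hashlib.sha256(chunk).hexdigest(), len(chunk))
    stored = sum(store.values())
    return logical / stored if stored else 1.0


weekly = make_daily_fulls(days=7)
print(f"{dedupe_ratio(weekly):.2f}:1")
```

Under these assumptions, seven full copies amount to 700 chunks of logical data but only 106 unique chunks, a ratio of roughly 6.6:1. Real-world ratios depend heavily on change rate, retention policy and data type.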
As more organizations implement disk in the backup process to overcome the performance and reliability shortcomings of tape-based protection, data deduplication has emerged as a force to improve the economic
Dedupe approaches compared
Hardware vendors spearheaded dedupe adoption with powerful, purpose-built deduplication appliances that process backup data before or after it's written to disk. Because it leaves the existing backup environment untouched, this hardware-based approach made deploying dedupe relatively easy. Research from the Enterprise Strategy Group has found that the ability to integrate with existing backup processes and overall ease of use matter more to organizations as adoption factors than specific technical considerations, such as the deduplication ratio or the granularity of deduplication.
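For readers who do want to weigh the technical considerations, the following Python sketch shows how granularity affects the deduplication ratio. It is a simplified illustration under stated assumptions: the 64 KB synthetic file, the three versions, the fixed chunk boundaries, and the helper names stored_bytes and make_versions are hypothetical and do not describe how any particular product chunks data.

```python
import hashlib
from typing import Optional

BLOCK = 4096  # assumed size of the smallest unit that changes


def stored_bytes(versions: list[bytes], chunk_size: Optional[int]) -> int:
    """Bytes kept after dedupe; chunk_size=None hashes each file as a whole."""
    seen: dict[str, int] = {}  # digest -> length of the piece it identifies
    for data in versions:
        size = chunk_size or max(len(data), 1)
        for offset in range(0, len(data), size):
            piece = data[offset:offset + size]
            seen.setdefault(hashlib.sha256(piece).hexdigest(), len(piece))
    return sum(seen.values())


def make_versions() -> list[bytes]:
    """Three versions of a 16-block file; one block changes in each new version."""
    blocks = [bytes([i]) * BLOCK for i in range(16)]
    versions = [b"".join(blocks)]
    for n, changed in enumerate((3, 7), start=1):
        blocks[changed] = bytes([200 + n]) * BLOCK
        versions.append(b"".join(blocks))
    return versions


versions = make_versions()
logical = sum(len(v) for v in versions)
for label, size in (
    ("whole file", None),
    ("16 KiB chunks", 4 * BLOCK),
    ("4 KiB chunks", BLOCK),
):
    kept = stored_bytes(versions, size)
    print(f"{label}: {kept} bytes stored, ratio {logical / kept:.2f}:1")
```

In this toy case, whole-file comparison saves nothing because every version differs slightly, while finer chunks recover more of the redundancy. Finer granularity also means more hashes to compute and index, which is part of why the ratio alone is not a sufficient basis for comparison.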
The premium that organizations place on seamless integration with existing data protection practices, combined with IT's historic resistance to change when it comes to backup software, meant that backup software vendors offering deduplication had a harder time gaining mindshare in the data center. When EMC Corp.'s Avamar came to market touting a better, more efficient way to back up data, the company faced an obstacle that was hard to overcome: reluctance to walk away from existing backup applications. IT organizations could clearly understand the benefits, but weren't motivated to initiate a technology change that would have a ripple effect on the operational aspects -- people and process -- of the data protection environment. EMC Avamar has therefore had to take a more circuitous route to the data center, positioning itself as a bandwidth- and storage-optimized backup solution for remote and branch offices, as well as an efficient data protection alternative for server virtualization environments.
However, the integration of acquired deduplication products by EMC (Avamar) and Symantec Corp. (PureDisk) with NetWorker and Veritas NetBackup, respectively, along with the recent introduction of native dedupe by CA, CommVault and IBM Corp., has a lot of IT organizations wondering which is the better implementation of deduplication -- hardware or software? Bottom line: It's not a one-size-fits-all scenario.
This was first published in May 2009