This article can also be found in the Premium Editorial Download "Storage magazine: What you need to know about data dedupe tools for backup."
Deduplication’s growing pains
As deduplication technology has matured, users have experienced most of the growing pains. Growing data volumes that tax backup and recovery have been a catalyst for performance and scale improvements, and have shifted attention to scale-out architectures for deduplication solutions. And replacing tape devices at remote and branch offices created requirements for optimized site-to-site replication, as well as a way to track those duplicate copies in the backup catalog.
In its most recent Data Protection Trends research report, ESG surveyed end users about their deduplication selection criteria, and cost was the top purchase consideration. Some of the issues affecting cost include the following:
- Some backup software vendors add deduplication as a no-cost feature (CA and IBM TSM), while others charge for it.
- There are hidden costs, such as the added fee to enable replication between deduplication systems. And the recovery site has to be a duplicate (or nearly so) of the system at the primary location, which can double fees. There are exceptions, such as Symantec 5000 Series appliances, which include device-to-device replication at no charge. Symantec also licenses its product based on the front-end capacity of the data being protected vs. the back-end capacity of the data being stored, so replicated copies don’t incur additional costs.
- Deduplication system vendors bundle their storage hardware with the deduplication software, so refreshing the hardware platform means repurchasing the software. Again, Symantec takes a different approach, licensing software and hardware separately.
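To make the front-end vs. back-end distinction concrete, here is a minimal sketch of how the licensed capacity might be counted under each model. The function name, the 10:1 reduction ratio and the capacity figures are hypothetical, chosen only for illustration; they are not any vendor's actual pricing terms, and per-terabyte prices are not modeled.

```python
def licensed_tb(protected_tb, dedupe_ratio, sites, model):
    """Return the capacity (in TB) a license would be counted against.

    Hypothetical model for illustration only:
    - "front-end": size of the source data being protected
    - "back-end":  size of the deduplicated data stored, at every site
    """
    if model == "front-end":
        return protected_tb
    if model == "back-end":
        return protected_tb / dedupe_ratio * sites
    raise ValueError(f"unknown licensing model: {model}")


# Assumed scenario: 100 TB protected, 10:1 data reduction.
front_one_site = licensed_tb(100, 10, sites=1, model="front-end")
front_two_sites = licensed_tb(100, 10, sites=2, model="front-end")
back_one_site = licensed_tb(100, 10, sites=1, model="back-end")
back_two_sites = licensed_tb(100, 10, sites=2, model="back-end")
```

Under these assumptions, adding a replica site leaves the front-end figure unchanged, while the back-end figure doubles because the second site stores its own copy.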
Users drive new dedupe developments
In addition to Arkeia’s progressive deduplication approach, other developments have been pushing the dedupe envelope. CommVault’s ability to deduplicate on physical tape media is one such example. In spite of the initial hype regarding disk-only data protection and the potential to eliminate tape, for most companies the reality is that tape is still an obvious, low-cost choice for long-term data retention. Dedupe has been considered only a disk-centric process due to the need for the deduplication index and all unique data to be available and accessible to “rehydrate” what’s stored. That means when deduplicated data is copied or moved from the deduplication store to tape media, it must be reconstituted, reversing all the benefits of data reduction. But CommVault’s Simpana software enables archival copies of deduplicated data without rehydration, requiring less tape media. Importantly, data can be recovered from tape media without having to first recover the entire tape to disk.
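The dependency between deduplicated data, the index and the unique data can be illustrated with a small sketch. It assumes a simple fixed-size chunking scheme with SHA-256 fingerprints held in memory; this is a generic illustration, not a description of how CommVault or any other product implements deduplication, and the names `dedupe`, `rehydrate` and `CHUNK_SIZE` are invented for the example.

```python
import hashlib

CHUNK_SIZE = 4  # unrealistically small fixed-size chunks, for illustration only


def dedupe(data: bytes, store: dict) -> list:
    """Split data into chunks, keep each unique chunk once, return the recipe."""
    recipe = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # only the first copy of a chunk is stored
        recipe.append(digest)            # the recipe records the order of chunks
    return recipe


def rehydrate(recipe: list, store: dict) -> bytes:
    """Rebuild the original data; every referenced chunk must be in the store."""
    return b"".join(store[digest] for digest in recipe)


store = {}                      # fingerprint -> unique chunk (index plus unique data)
backup = b"ABCDABCDABCDEFGH"    # 16 bytes containing a repeated pattern
recipe = dedupe(backup, store)
stored_bytes = sum(len(chunk) for chunk in store.values())
restored = rehydrate(recipe, store)
```

In this sketch the recipe alone cannot reproduce the backup; `rehydrate` needs the store as well. That is why a conventional copy to tape writes out the rehydrated data, while a deduplication-aware tape copy has to carry the unique data and the references together.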
This was first published in August 2011