Data deduplication products can dramatically lower capacity requirements, but picking the best one for your needs can be tricky.
Exaggerated claims, rapidly changing technology and persistent myths make navigating the deduplication landscape treacherous. But the rewards of a successful dedupe installation are indisputable.
"We're seeing the growing popularity of secondary storage and archival systems with single-instance storage," says Lauren Whitehouse, analyst at Enterprise Strategy Group (ESG), Milford, MA. "A couple of deduplication products have even appeared for use with primary storage."
The technology is maturing rapidly. "We looked at deduplication two years ago and it wasn't ready," says John Wunder, director of IT at Milpitas, CA-based Magnum Semiconductor, which makes chips for media processing. Recently, Wunder pulled together a deduplication process by combining pieces from Diligent Technologies Corp. (deduplication engine), Symantec Corp. (Veritas NetBackup backup software) and Quatrio (servers and storage).
Assembling the right pieces requires a clear understanding of the different dedupe technologies, thorough testing of products before they go into production, and close attention to major product changes such as the introduction of hybrid deduplication (see "Dedupe alternatives," below) and the emergence of global deduplication.
Dedupe alternatives
Until recently, deduplication was performed either in-line or post-processing. Now vendors are blurring those boundaries.
- FalconStor Software Corp. offers what it calls a hybrid model, in which it begins the post-process deduping of a backup job on a series of tapes without waiting for the entire backup process to be completed, thereby speeding the post-processing effort.
- Quantum Corp. offers what it calls adaptive deduplication, which starts as in-line processing with the data being deduped as it's written. Then it adds a buffer that can increase dynamically as the data input volume outpaces the processing. It dedupes the data in the buffer in post-processing style.
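
To make the in-line versus post-processing distinction concrete, here is a minimal Python sketch of the adaptive idea described above. It is an illustration under stated assumptions, not any vendor's implementation: the fixed chunk size, the SHA-256 fingerprints, the `DedupeStore` class and the `inline_budget` parameter (a stand-in for "how much the engine can process before input outpaces it") are all hypothetical.

```python
import hashlib
from collections import deque

CHUNK_SIZE = 4  # hypothetical; tiny so the example is easy to follow


class DedupeStore:
    """Toy single-instance store: keeps one copy of each unique chunk."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> chunk bytes
        self.recipe = []  # ordered fingerprints needed to rebuild the stream

    def ingest(self, chunk):
        fp = hashlib.sha256(chunk).hexdigest()
        self.chunks.setdefault(fp, chunk)  # store the chunk only if it is new
        self.recipe.append(fp)

    def restore(self):
        return b"".join(self.chunks[fp] for fp in self.recipe)


def adaptive_backup(data, inline_budget):
    """Dedupe up to `inline_budget` chunks as they arrive (in-line);
    buffer the rest and dedupe them afterwards (post-processing style).
    Returns the store and the number of chunks that were buffered."""
    store = DedupeStore()
    buffer = deque()
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        if inline_budget > 0:
            store.ingest(chunk)   # in-line: deduped as it is written
            inline_budget -= 1
        else:
            buffer.append(chunk)  # input has outpaced processing
    buffered = len(buffer)
    while buffer:                 # drain the buffer after the fact
        store.ingest(buffer.popleft())
    return store, buffered
```

For example, `adaptive_backup(b"AAAABBBBAAAACCCCAAAA", inline_budget=2)` dedupes the first two chunks in-line and the remaining three from the buffer. Either way, the store ends up holding three unique chunks rather than five.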
...

"Global deduplication is the process of fanning in multiple sources of data and performing deduplication across those sources," says ESG's Whitehouse. Currently, each appliance maintains its own index of duplicate data. Global deduplication requires a way to share those indexes across appliances (see "Global deduplication," below).
Global deduplication
"Global deduplication is the process of fanning in multiple sources of data and performing deduplication across those sources," says Lauren Whitehouse, analyst at Enterprise Strategy Group (ESG), Milford, MA. Global dedupe generally results in higher ratios and allows you to scale input/output. The global dedupe process differs when you're deduping on the target side or the source side, notes Whitehouse.
- Target side: Replicate indexes of multiple silos to a central, larger silo to produce a consolidated index that ensures only unique files/segments are transported.
- Source side: Fan in indexes from remote offices/branch offices (ROBOs) and dedupe to create a central, consolidated index repository.
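
The consolidated index that both approaches depend on can be sketched in a few lines of Python. This is a simplified illustration, not a description of how any product shares indexes: the function names, the use of SHA-256 fingerprints and the in-memory dictionaries are assumptions made for the example.

```python
import hashlib


def fingerprint(segment):
    """Hypothetical segment fingerprint; real products may use other schemes."""
    return hashlib.sha256(segment).hexdigest()


def build_index(segments):
    """Per-appliance (or per-office) index: fingerprint -> segment."""
    return {fingerprint(s): s for s in segments}


def consolidate(central_index, local_index):
    """Fan a local index into the central one.

    Only segments the central index has not seen are added, and those are
    the only segments that would need to be transported. Returns them."""
    transported = []
    for fp, segment in local_index.items():
        if fp not in central_index:
            central_index[fp] = segment
            transported.append(segment)
    return transported
```

If two sites each hold a copy of the same segment, the second call to `consolidate` skips it: only the segments unique to that site cross the wire, which is why deduping across sources tends to yield higher ratios than deduping each source in isolation.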