Complete guide to backup deduplication
For space reclamation and capacity optimization, deduplication is a powerful tool -- as long as you use the right method. In his Storage Decisions New York session, "Attack of the Killer Capacity," Taneja Group analyst Mike Matchett discussed the benefits of compression and deduplication, and how the two differ from each other.
"Not all deduplication is equal," Matchett said. "You can get different deduplication ratios, and it's worth testing out."
Inline deduplication removes redundant copies of data before or while it is written to the backup device. The benefit is that the backup target only has to store the final, deduplicated data; the drawback is that the extra processing can slow down the backup.
Post-process dedupe, the alternative, does not eliminate the extraneous data until after it has been backed up, so the target must first hold the full, undeduplicated backup. "So, if you're not aware of that, this can be done once you have the data landed," explained Matchett. "A scavenger process comes around behind and finds opportunities to deduplicate."
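The difference between the two approaches can be illustrated with a minimal Python sketch. It is not taken from Matchett's session: the fixed 4-byte chunk size, the SHA-256 fingerprints and the class names `InlineStore` and `PostProcessStore` are all assumptions chosen to keep the example small, and real products typically use larger or variable-size chunks.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical; tiny so the example is easy to follow


def chunks(data: bytes, size: int = CHUNK_SIZE):
    """Split data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprint(chunk: bytes) -> str:
    """Identify a chunk by its SHA-256 digest."""
    return hashlib.sha256(chunk).hexdigest()


class InlineStore:
    """Deduplicates in the write path, before a chunk reaches the device."""

    def __init__(self):
        self.blocks = {}         # fingerprint -> the single stored copy
        self.recipe = []         # ordered fingerprints to rebuild the stream
        self.chunks_written = 0  # chunks that actually hit the device

    def write(self, data: bytes):
        for chunk in chunks(data):
            fp = fingerprint(chunk)  # hashing cost is paid during the backup
            if fp not in self.blocks:
                self.blocks[fp] = chunk
                self.chunks_written += 1
            self.recipe.append(fp)


class PostProcessStore:
    """Lands every chunk first; a later pass removes the duplicates."""

    def __init__(self):
        self.landed = []         # raw chunks exactly as they were written
        self.blocks = {}
        self.recipe = []
        self.chunks_written = 0

    def write(self, data: bytes):
        for chunk in chunks(data):
            self.landed.append(chunk)  # no hashing in the write path
            self.chunks_written += 1

    def scavenge(self):
        """The 'scavenger' pass: deduplicate data that has already landed."""
        for chunk in self.landed:
            fp = fingerprint(chunk)
            self.blocks.setdefault(fp, chunk)
            self.recipe.append(fp)
        self.landed = []


def restore(store) -> bytes:
    """Rebuild the original stream from the recipe of fingerprints."""
    return b"".join(store.blocks[fp] for fp in store.recipe)


backup = b"AAAABBBBAAAACCCCAAAA"  # five chunks, three of them unique

inline = InlineStore()
inline.write(backup)

post = PostProcessStore()
post.write(backup)
post.scavenge()

print("inline chunks written:", inline.chunks_written)
print("post-process chunks written:", post.chunks_written)
print("unique blocks kept:", len(inline.blocks), len(post.blocks))
```

Both stores end up keeping the same unique blocks. What differs is when the work happens and how much data has to land on the device before the duplicates are removed.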
Matchett also addressed the idea that compression and deduplication might be the "same thing." He admitted that the two are technically similar but said they perform different tasks. "But they work differently in practice, so compression looks at your file and makes it smaller and then you go and you can dedupe it," explained Matchett.
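As a rough illustration of that distinction, the following Python sketch compresses each file on its own and then deduplicates the results. The file contents, the helper names `compress_file` and `dedupe`, and the whole-object granularity are assumptions made for the example, not details from the session. It relies on `zlib.compress` producing the same output for the same input within a single run.

```python
import hashlib
import zlib


def compress_file(data: bytes) -> bytes:
    """Compression: shrink one file by removing redundancy inside it."""
    return zlib.compress(data)


def dedupe(objects):
    """Deduplication: keep one copy of each distinct object across the set."""
    store = {}
    for obj in objects:
        store.setdefault(hashlib.sha256(obj).hexdigest(), obj)
    return store


# Hypothetical sample data: two identical files and one different file.
file_a = b"quarterly report " * 200
file_b = b"quarterly report " * 200  # exact copy of file_a
file_c = b"meeting notes " * 200

files = [file_a, file_b, file_c]

compressed = [compress_file(f) for f in files]  # each file gets smaller
unique = dedupe(compressed)                     # the duplicate copy is dropped

raw_bytes = sum(len(f) for f in files)
compressed_bytes = sum(len(c) for c in compressed)
stored_bytes = sum(len(c) for c in unique.values())

print("raw bytes:", raw_bytes)
print("after compression:", compressed_bytes)
print("after compression and dedupe:", stored_bytes)
```

Compression makes every file smaller but still stores the copy of `file_a` twice; deduplication is the step that notices the two compressed objects are identical and keeps only one.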
For those looking for products that aid capacity management, Matchett mentioned a number of vendors offering such tools, including XtremIO, Kaminario and RainStor. "You're going to do thin provisioning, dedupe and compression," said Matchett. "These are becoming sort of these basic check marks now for all kinds of storage that you're going to find."