Should I use data deduplication to manage unstructured data?

Yes. Data deduplication, or single-instance storage, can help in two areas. Deduplication is fairly common today in backups, using a variety of appliances or software. Because it operates on backup copies, however, it doesn't address the immediate problem of unstructured data growth at the source. There are deduplication products that can help at the source, but they're not as popular today because of the added workload involved in identifying duplicate files, blocks or byte sequences, depending on the level of deduplication you choose. On source or production storage, that extra work may impact performance, making the technology less appealing. This is why we see a lot of deduplication deployed at the backup level. See the article Data Deduplication Explained for more information.
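
To make the "added workload" concrete, here is a minimal sketch of block-level deduplication in Python. It assumes fixed 4 KB blocks and SHA-256 digests; the function names, the block size and the in-memory dictionary used as a block store are illustrative choices, not how any particular product works.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size; real products vary


def dedupe_blocks(data: bytes, store: dict) -> list:
    """Store each unique block once and return the digests needed to rebuild data."""
    recipe = []
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        # Hashing every block is the extra work done on each write.
        digest = hashlib.sha256(block).digest()
        store.setdefault(digest, block)  # keep only the first copy of a block
        recipe.append(digest)
    return recipe


def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original data from its list of block digests."""
    return b"".join(store[digest] for digest in recipe)


if __name__ == "__main__":
    store = {}
    document = b"A" * 8192 + b"B" * 4096  # three blocks, two of them identical
    recipe = dedupe_blocks(document, store)
    print(len(recipe), "blocks referenced,", len(store), "blocks stored")
    assert restore(recipe, store) == document
```

Every write pays for a hash computation and a lookup, which is the performance cost described above; file-level deduplication would hash whole files instead and do less work, at the price of missing duplicates inside files.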

Still, specific products, like email archiving tools, can help reduce the amount of data being stored, keeping just what you need from a policy perspective. The deduplication element of these tools lets you shrink storage requirements further. Deduplication is also appearing at the WAN level to reduce the volume of data transferred between locations, particularly when replicating data. Deduplication certainly isn't limited to unstructured data, but that's where it can really shine.
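
The WAN case can be sketched the same way: the sending site describes its data as a list of chunk digests, the receiving site reports which chunks it lacks, and only those chunks cross the link. The `RemoteSite` class, the `replicate` function and the 4 KB chunk size below are hypothetical, chosen only to illustrate the idea; real replication products use their own protocols.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed chunk size for illustration


class RemoteSite:
    """Hypothetical stand-in for the receiving end of a WAN link."""

    def __init__(self):
        self.chunks = {}  # digest -> chunk bytes already held remotely

    def missing(self, digests):
        """Return the digests this site does not have yet."""
        return {d for d in digests if d not in self.chunks}

    def receive(self, new_chunks, recipe):
        """Store newly sent chunks and rebuild the replicated data."""
        self.chunks.update(new_chunks)
        return b"".join(self.chunks[d] for d in recipe)


def replicate(data, remote):
    """Send only chunks the remote lacks; return (payload bytes sent, remote copy)."""
    local = {}
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).digest()
        local[digest] = chunk
        recipe.append(digest)
    payload = {d: local[d] for d in remote.missing(recipe)}
    rebuilt = remote.receive(payload, recipe)
    return sum(len(c) for c in payload.values()), rebuilt


if __name__ == "__main__":
    remote = RemoteSite()
    first = b"x" * CHUNK_SIZE + b"y" * CHUNK_SIZE
    sent, _ = replicate(first, remote)
    print("first replication sent", sent, "bytes")
    sent, _ = replicate(first + b"z" * CHUNK_SIZE, remote)
    print("second replication sent", sent, "bytes")
```

In this sketch the second replication transfers only the one new chunk, even though the data set has grown to three chunks. The digests themselves also travel over the link, which the byte count here ignores.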


This was first published in March 2007