Is data deduplication right for your primary storage infrastructure?


Rick Cook
10.14.2009


Data deduplication is a firmly established technique for reducing storage demands in data backup, and a handful of vendors are now applying the technology to primary storage. However, the demands on primary storage are considerably different from those on data backup, so if you're undertaking a primary deduplication project, you'll need to learn about the different requirements and techniques.

In both primary storage and data backup, deduplication technology scans the data to be stored and replaces duplicate blocks or files with pointers to the previously stored blocks or files that have been duplicated.
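The pointer replacement described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular product works: the `DedupeStore` class, the fixed 4 KB block size and the use of SHA-256 fingerprints are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; real products vary


class DedupeStore:
    """Toy block store: each unique block is kept once, keyed by its hash."""

    def __init__(self, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self.blocks = {}  # digest -> block bytes (stored once)

    def write(self, data):
        """Store data and return the list of pointers (digests) describing it."""
        pointers = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            # A duplicate block is not stored again; only a pointer is recorded.
            self.blocks.setdefault(digest, block)
            pointers.append(digest)
        return pointers

    def read(self, pointers):
        """Rebuild the original data by following the pointers."""
        return b"".join(self.blocks[p] for p in pointers)


store = DedupeStore()
file_a = b"A" * 8192 + b"B" * 4096   # blocks: A, A, B
file_b = b"A" * 4096 + b"C" * 4096   # blocks: A, C
ptrs_a = store.write(file_a)
ptrs_b = store.write(file_b)
print(len(ptrs_a) + len(ptrs_b), "logical blocks,", len(store.blocks), "stored blocks")
# prints "5 logical blocks, 3 stored blocks"
```

Five blocks are written, but only three distinct blocks are kept; the repeated "A" blocks exist once and are referenced by pointer.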

In backup, data deduplication is highly space efficient, resulting in storage savings of as much as 20:1. But because primary storage offers fewer opportunities for deduplication, primary dedupe usually doesn't produce the same kind of space savings. Rather than 20:1, primary dedupe is more likely to result in ratios of 2:1.
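To put those ratios in terms of capacity, a ratio of N:1 means the data occupies 1/N of its original space. The small helper below (the name `space_saved` is made up for this example) does the arithmetic:

```python
def space_saved(ratio):
    """Fraction of capacity saved for a dedupe ratio of ratio:1."""
    if ratio < 1:
        raise ValueError("a dedupe ratio below 1:1 is not meaningful")
    return 1 - 1 / ratio


for ratio in (20, 2):
    print(f"{ratio}:1 -> {space_saved(ratio):.0%} less capacity")
# prints "20:1 -> 95% less capacity" then "2:1 -> 50% less capacity"
```

A 2:1 ratio still halves the capacity needed, which is why primary dedupe can be worth considering even though it falls well short of backup ratios.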

Before you start a primary deduplication project

If you're considering a primary deduplication project, first determine what you intend to dedupe. Study your data and look for likely candidates, such as applications with data that rarely changes, and rule out workloads such as transactional databases where you can't afford performance penalties. Run tests to measure the performance impact before you commit to deduping your primary storage.

Because each system is unique, you should carefully consider the effects of primary dedupe before applying it. The effectiveness of primary dedupe depends in large part on the characteristics of the system it's applied to, including:

  • The mix of applications

  • Usage patterns

  • The rate of change in the data

  • Processor power, storage configuration and network throughput
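One way to study your data before committing is to scan a representative directory tree and count how many of its blocks are duplicates. The sketch below is a rough estimator only: the function name `estimate_dedupe_ratio`, the fixed 4 KB block size and the SHA-256 fingerprints are assumptions for illustration, and a real product may use different block sizes or matching techniques and report different numbers.

```python
import hashlib
import os
import tempfile


def estimate_dedupe_ratio(root, block_size=4096):
    """Walk root and return (logical_blocks, unique_blocks, ratio)."""
    seen = set()
    logical = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    while True:
                        block = handle.read(block_size)
                        if not block:
                            break
                        logical += 1
                        seen.add(hashlib.sha256(block).digest())
            except OSError:
                continue  # skip files that cannot be read
    ratio = logical / len(seen) if seen else 1.0
    return logical, len(seen), ratio


# Small demonstration on throwaway files; point the function at real data instead.
with tempfile.TemporaryDirectory() as demo_dir:
    samples = (("a.bin", b"x" * 8192), ("b.bin", b"x" * 8192), ("c.bin", b"y" * 4096))
    for name, payload in samples:
        with open(os.path.join(demo_dir, name), "wb") as handle:
            handle.write(payload)
    logical, unique, ratio = estimate_dedupe_ratio(demo_dir)
    print(f"{logical} blocks scanned, {unique} unique, about {ratio:.1f}:1")
# prints "5 blocks scanned, 2 unique, about 2.5:1"
```

Running the estimator separately against each application's data gives a first indication of which parts of the mix are worth deduping.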

Latency is another issue that separates primary storage and data backup when it comes to dedupe.

Because every block or file has to be checked for duplication, data deduplication exacts a performance penalty and consumes resources for checking data. Latency is more likely to affect users in primary storage than in backups. As a result, many primary dedupe products emphasize performance, but this often comes at the expense of storage efficiency.
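To get a feel for the checking cost, you can time how long it takes to fingerprint a block. The sketch below assumes SHA-256 over 4 KB blocks, which is a choice made for this example; it measures only the hashing step, not the index lookups or disk I/O a real system also performs.

```python
import hashlib
import os
import time


def hashing_cost_per_block(block_size=4096, blocks=2000):
    """Return average seconds spent fingerprinting one block of random data."""
    data = [os.urandom(block_size) for _ in range(blocks)]
    start = time.perf_counter()
    for block in data:
        hashlib.sha256(block).digest()
    elapsed = time.perf_counter() - start
    return elapsed / blocks


cost = hashing_cost_per_block()
print(f"about {cost * 1e6:.1f} microseconds per 4 KB block on this machine")
```

The figure varies with processor power, so it is best read alongside tests on the actual system rather than as a prediction.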

Data deduplication options for primary storage

A number of companies, including NetApp Inc. and the recently acquired Data Domain Inc., offer options for primary dedupe. Other vendors offer dedupe capabilities combined with features such as in-line compression to reduce the footprint of data not suited for dedupe or to automatically identify deduplication opportunities in the data stream. Storwize Inc. offers real-time compression, while Ocarina Networks uses an extraction compression technique to identify dedupe candidates.

Data deduplication is also becoming an increasingly popular option in virtualized systems because the multiple instances of the OS are highly redundant and seldom change. Other contents of C: drives on virtual machines (VMs) are also highly redundant and barely change.

Generally speaking, the more closely an application or data file behaves like write once, read many (WORM) data, the better suited it is for deduplication. CAD files and graphics files are therefore also good candidates because they change so little.

A classic example of where data deduplication is a poor choice is a transactional database where data changes frequently. Constant changes increase dedupe activity and place a heavy load on system resources. The units of data being written are also often too small to match standard block sizes efficiently.
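The effect of frequent small changes can be illustrated with a toy simulation. Everything here is an assumption for the example: the 4 KB blocks, the 64-byte "records", the helper names and the random update pattern. It starts with a highly redundant volume and shows how scattered small writes turn shared blocks into unique ones.

```python
import hashlib
import random


def unique_blocks(data, block_size=4096):
    """Count distinct fixed-size blocks in data."""
    return len({hashlib.sha256(data[i:i + block_size]).digest()
                for i in range(0, len(data), block_size)})


def apply_small_updates(data, updates, record_size=64, seed=0):
    """Overwrite randomly placed small records with random bytes."""
    rng = random.Random(seed)
    buf = bytearray(data)
    for _ in range(updates):
        offset = rng.randrange(0, len(buf) - record_size)
        buf[offset:offset + record_size] = bytes(
            rng.getrandbits(8) for _ in range(record_size))
    return bytes(buf)


volume = bytes(4096 * 100)          # 100 identical blocks: ideal for dedupe
before = unique_blocks(volume)
after = unique_blocks(apply_small_updates(volume, updates=50))
print(f"unique blocks before updates: {before}, after 50 small updates: {after}")
```

Each 64-byte write changes only a sliver of a block, yet the whole block stops matching its former duplicates and has to be stored and re-checked.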

Copyright 2000 - 2009, TechTarget. All Rights Reserved.