Is data deduplication right for your primary storage infrastructure?


Rick Cook

Data deduplication is a firmly established technique for reducing the storage demands of data backup, and a handful of vendors are now applying the technology to primary storage. However, the demands on primary storage differ considerably from those on data backup, so if you're undertaking a primary deduplication project, you'll need to learn about the different requirements and techniques.

In both primary storage and data backup, deduplication technology scans the data to be stored and replaces duplicate blocks or files with pointers to the previously stored blocks or files that have been duplicated.
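
The pointer replacement described above can be illustrated with a minimal sketch. It assumes fixed-size blocks and SHA-256 fingerprints; the DedupeStore class, the 4 KB block size and the file names are hypothetical choices made for illustration, not any vendor's implementation.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical fixed block size, chosen for illustration


class DedupeStore:
    """Toy block store: each unique block is kept once and files hold pointers."""

    def __init__(self):
        self.blocks = {}  # fingerprint -> block bytes (stored once)
        self.files = {}   # file name -> list of fingerprints (the pointers)

    def write(self, name, data):
        pointers = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            fingerprint = hashlib.sha256(block).hexdigest()
            # Store the block only if it has not been seen before.
            self.blocks.setdefault(fingerprint, block)
            pointers.append(fingerprint)
        self.files[name] = pointers

    def read(self, name):
        # Follow the pointers to rebuild the original data.
        return b"".join(self.blocks[fp] for fp in self.files[name])

    def stored_bytes(self):
        return sum(len(block) for block in self.blocks.values())


store = DedupeStore()
original = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE
store.write("report_v1", original)
store.write("report_v2", original + b"C" * BLOCK_SIZE)  # shares two blocks with v1

assert store.read("report_v1") == original
print(store.stored_bytes())  # 12288: three unique blocks stored for five written
```

Real products differ in how they pick block boundaries, handle fingerprint collisions and persist the index, so treat this only as a picture of the basic idea.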

In backup, data deduplication is highly space efficient, resulting in storage savings of as much as 20:1. But because primary storage offers fewer opportunities for deduplication, primary dedupe usually doesn't produce the same kind of space savings. Rather than 20:1, primary dedupe is more likely to result in ratios of 2:1.
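
To put those ratios in capacity terms, a ratio of N:1 means the data occupies 1/N of its original space. The short calculation below shows the conversion; space_saved is a hypothetical helper name used only for this illustration.

```python
def space_saved(ratio):
    """Fraction of capacity saved for a dedupe ratio expressed as ratio:1."""
    return 1 - 1 / ratio


for ratio in (20, 2):
    print(f"{ratio}:1 -> {space_saved(ratio):.0%} saved")
```

A 20:1 ratio corresponds to 95% less capacity, while 2:1 corresponds to 50%, which is still a substantial saving on expensive primary disk.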

Before you start a primary deduplication project

If you're considering a primary deduplication project, first determine what you intend to dedupe. Study your data to find likely candidates, such as applications with data that rarely changes, and to rule out poor ones, such as transactional databases where you can't afford performance penalties. Run tests to measure the performance impact before you commit to deduping your primary storage.
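
One rough way to study a candidate data set is to estimate how many of its blocks are duplicates. The sketch below assumes fixed 4 KB blocks and SHA-256 fingerprints; estimate_dedupe_ratio and its block_size parameter are hypothetical names, and the result is only an approximation of what a real product would achieve with its own chunking method.

```python
import hashlib
import os


def estimate_dedupe_ratio(root, block_size=4096):
    """Return (logical_bytes, unique_bytes) for fixed-size blocks under root."""
    seen = set()
    logical = unique = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                with open(path, "rb") as handle:
                    while True:
                        block = handle.read(block_size)
                        if not block:
                            break
                        logical += len(block)
                        digest = hashlib.sha256(block).digest()
                        if digest not in seen:
                            # First time this block content appears.
                            seen.add(digest)
                            unique += len(block)
            except OSError:
                continue  # skip files that cannot be read
    return logical, unique
```

Dividing the logical bytes by the unique bytes gives an estimated ratio for that directory tree. Running it against a copy of the data, rather than the live volume, avoids adding load to production storage.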

Because each system is unique, you should carefully consider the effects of primary dedupe before applying it. The effectiveness of primary dedupe depends in large part on the characteristics of the system it's applied to, including:

  • The mix of applications

  • Usage patterns

  • The rate of change in the data

  • Processor power, storage configuration and network throughput

Latency is another issue that separates primary storage and data backup when it comes to dedupe.

Because every block or file has to be checked for duplication, data deduplication exacts a performance penalty and consumes resources. Users are more likely to notice the added latency in primary storage than in backups. As a result, many primary dedupe products emphasize performance, often at the expense of storage efficiency.
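
The cost of checking every block can be observed with a simple timing comparison. This sketch measures only the CPU time of SHA-256 fingerprinting in Python on in-memory data, which is an assumption made for illustration; it says nothing about disk, network or index lookup latency in a real array. The time_writes function and the block count are hypothetical.

```python
import hashlib
import os
import time


def time_writes(blocks, fingerprint):
    """Time appending blocks to an in-memory list, optionally fingerprinting each."""
    sink = []
    index = set()
    start = time.perf_counter()
    for block in blocks:
        if fingerprint:
            index.add(hashlib.sha256(block).digest())  # the extra dedupe check
        sink.append(block)
    return time.perf_counter() - start


blocks = [os.urandom(4096) for _ in range(2000)]
plain = time_writes(blocks, fingerprint=False)
deduped = time_writes(blocks, fingerprint=True)
print(f"plain: {plain * 1e3:.2f} ms, with fingerprinting: {deduped * 1e3:.2f} ms")
```

The absolute numbers vary by machine, so a test like this is useful mainly for comparing configurations on the same hardware.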

Data deduplication options for primary storage

A number of companies, including NetApp Inc. and the recently acquired Data Domain Inc., offer options for primary dedupe. Other vendors combine dedupe with features such as in-line compression, either to reduce the footprint of data not suited for dedupe or to automatically identify deduplication opportunities in the data stream. Storwize Inc. offers real-time compression, while Ocarina Networks uses an extraction compression technique to identify dedupe candidates.

Data deduplication is also becoming an increasingly popular option in virtualized systems because the multiple instances of the OS are highly redundant and seldom change. Other contents of C: drives on virtual machines (VMs) are also highly redundant and barely change.

Generally speaking, the more closely an application or data file behaves like data on a write once, read many (WORM) device, the better suited it is for deduplication. CAD files and graphics files are therefore also good candidates because they change so little.

A classic example of a poor fit for data deduplication is a transactional database, where data changes frequently. Frequent changes increase deduplication activity and place a heavy load on system resources. File sizes are also often too small to match standard block sizes efficiently.

All Rights Reserved, Copyright 2000 - 2010, TechTarget | Read our Privacy Policy