Storage Magazine


The skinny on data deduplication
by W. Curtis Preston
Issue: Jan 2007

Data deduplication products drastically cut the amount of data you need to back up, but the way these systems reduce and store data varies.


Data deduplication changes all the rules in secondary storage. Most notably, it overturns two of them: the rule of thumb that every gigabyte of primary storage is represented by 10GB of backups, and the canard that tape is cheaper than disk.

There's been a flurry of debate about deduplication, both for and against, that has generated confusion, fear, uncertainty, doubt and misconceptions about the technology. Simply put, deduplication technologies identify and eliminate redundant data, significantly reducing the amount of disk needed to store the deduped data. Though various deduplication systems eliminate redundant data differently, all of the approaches look at the data on a subfile (block) level to determine if the system has seen the data before. If it hasn't, the system stores the data. If it has, the system keeps only the one stored copy and records every other reference to that data as a pointer to it.
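The store-once-and-point idea can be sketched in a few lines of Python. This is only an illustration, not how any particular product works: the fixed 4-byte block size, the use of SHA-256 digests to recognize a block the system has seen before, and the `DedupStore` class itself are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4  # assumption: tiny fixed-size blocks, chosen only to keep the demo readable


class DedupStore:
    """Hypothetical toy store: each unique block is kept once, keyed by its digest."""

    def __init__(self):
        self.blocks = {}   # digest -> block bytes (each unique block stored once)
        self.backups = {}  # backup name -> list of digests (the "pointers")

    def write(self, name, data):
        """Record a backup and return how many new blocks it added to the store."""
        new_blocks = 0
        pointers = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:   # never seen before: store it
                self.blocks[digest] = block
                new_blocks += 1
            pointers.append(digest)         # seen or not, the backup only points to it
        self.backups[name] = pointers
        return new_blocks

    def read(self, name):
        """Rebuild a backup by following its pointers."""
        return b"".join(self.blocks[d] for d in self.backups[name])


store = DedupStore()
first = store.write("server1", b"AAAABBBBCCCC")       # all three blocks are new
second = store.write("server2", b"AAAABBBBCCCC")      # same data from another server
third = store.write("server1-next", b"AAAABBBBDDDD")  # one changed block
print(first, second, third, len(store.blocks))
```

The second write adds nothing to the store because every block is already present, and the third adds only the one block that changed, yet all three backups can still be read back in full.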

For example, a deduplication system would store the following data only one time:

  • The same file backed up from five different servers
  • A weekly full backup in which 95% of the blocks duplicate data stored last week (only the remaining 5% would be new)
  • A daily full backup of a database that doesn't support incremental backups (most of it would be duplicate blocks from the day before)
  • Incremental backups of files that change constantly, such as a spreadsheet that's updated every day

Perhaps the biggest benefit deduplication brings to the table is the ability to have onsite and offsite backups without touching a single tape. A deduplicating virtual tape library (VTL) stores only the new, unique blocks from each night's backups. Those blocks can then be replicated to a second VTL residing outside the main data center; replication becomes more practical when only the new, unique blocks have to be sent.
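To see why replicating only new, unique blocks keeps the transfer small, consider the rough Python sketch below. It is a simplification under stated assumptions: the two block stores are plain dictionaries keyed by SHA-256 digest, and the `digest` and `replicate` helpers are hypothetical names invented for the example. A real VTL would compare digests over a network link rather than share memory.

```python
import hashlib


def digest(block):
    """Hypothetical helper: identify a block by its SHA-256 digest."""
    return hashlib.sha256(block).hexdigest()


def replicate(local_blocks, remote_blocks):
    """Copy only the blocks the remote store lacks; return the number of bytes sent."""
    sent = 0
    for key, block in local_blocks.items():
        if key not in remote_blocks:   # the remote has never seen this block
            remote_blocks[key] = block
            sent += len(block)
    return sent


onsite = {}   # stands in for the VTL in the main data center
offsite = {}  # stands in for the second VTL at the remote site

# Night 1: three unique blocks land onsite, and all of them must be sent.
for block in (b"AAAA", b"BBBB", b"CCCC"):
    onsite[digest(block)] = block
first_night = replicate(onsite, offsite)

# Night 2: the backup contains only one block the system has not seen before.
onsite[digest(b"DDDD")] = b"DDDD"
second_night = replicate(onsite, offsite)

print(first_night, second_night)
```

After the first night, each later replication costs only as much as that night's new blocks, however large the full backup set has grown.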

All Rights Reserved, Copyright 2000 - 2008, TechTarget