Storage Magazine


Catching up with deduplication
by Jerome Wendt
Issue: Jun 2007

EMC Avamar and Symantec Veritas NetBackup PureDisk take a slightly different approach to the performance issue. They use agents that draw on the computing resources of each client server to do the initial file hash. As part of this process, the agents communicate with the main backup server, which maintains a central database of the unique file hashes. As the Avamar or PureDisk agents on the servers hash the files, they check with the central server to see if the generated hash already exists. If the hash exists, the agent ignores the file; if it doesn't, the agent breaks the file into smaller segments and looks for new unique file segments to store. From that point, EMC Avamar and PureDisk diverge in their implementations.
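The agent-side flow described above can be sketched in a few lines of Python. This is an illustration only, not either vendor's implementation: the names `CentralIndex`, `backup_file` and `SEGMENT_SIZE` are hypothetical, SHA-256 and fixed-size segments are assumptions (the article does not say which hash or segmentation method the products use), and the central database is modeled as an in-memory object rather than a server reached over the network.

```python
import hashlib

# Hypothetical segment size, kept tiny so the example is easy to follow.
SEGMENT_SIZE = 4


def digest(data: bytes) -> str:
    """Hash a file or segment (SHA-256 is an assumption, not the vendors' choice)."""
    return hashlib.sha256(data).hexdigest()


class CentralIndex:
    """Stand-in for the central database of unique hashes on the backup server."""

    def __init__(self) -> None:
        self.file_hashes: set[str] = set()
        self.segments: dict[str, bytes] = {}


def backup_file(data: bytes, index: CentralIndex) -> int:
    """Back up one file and return how many new segments were stored."""
    file_hash = digest(data)
    if file_hash in index.file_hashes:
        # The whole file is already known, so the agent ignores it.
        return 0

    new_segments = 0
    for offset in range(0, len(data), SEGMENT_SIZE):
        segment = data[offset:offset + SEGMENT_SIZE]
        segment_hash = digest(segment)
        if segment_hash not in index.segments:
            # Only segments the index has never seen are stored.
            index.segments[segment_hash] = segment
            new_segments += 1

    index.file_hashes.add(file_hash)
    return new_segments


index = CentralIndex()
first = backup_file(b"aaaabbbbcccc", index)   # three new segments
repeat = backup_file(b"aaaabbbbcccc", index)  # file hash already known
similar = backup_file(b"aaaabbbbdddd", index) # only the last segment is new
print(first, repeat, similar)
```

In this sketch the second backup of the same file stores nothing, and a file that shares two of its three segments with an earlier file adds only one new segment.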

EMC Avamar allows server storage capacity to grow to approximately 1.5TB. Although Symantec Veritas NetBackup PureDisk servers can grow to manage nearly 4TB of PureDisk storage capacity, EMC Avamar uses segment sizes that are about one-fourth the size of PureDisk's. This allows it to better identify redundant data in files, asserts Jed Yueh, EMC Avamar's VP of product management. For users who need more capacity and scale, EMC Avamar uses a redundant array of independent nodes (RAIN) clustering architecture. Organizations can add server nodes to the RAIN cluster to increase server capacity and performance by striping the data across multiple nodes.
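The claim that smaller segments find more redundant data can be illustrated with a toy example. The sketch below assumes fixed-size segments and SHA-256 hashing, neither of which is confirmed by the article, and the function name `unique_bytes_stored` is hypothetical. It backs up a 64-byte file and a copy with one byte changed, first with 16-byte segments and then with segments one-fourth that size.

```python
import hashlib


def unique_bytes_stored(files: list[bytes], segment_size: int) -> int:
    """Total bytes kept after storing each distinct segment only once."""
    seen: set[bytes] = set()
    stored = 0
    for data in files:
        for offset in range(0, len(data), segment_size):
            segment = data[offset:offset + segment_size]
            segment_hash = hashlib.sha256(segment).digest()
            if segment_hash not in seen:
                seen.add(segment_hash)
                stored += len(segment)
    return stored


original = bytes(range(64))      # 64 distinct byte values
changed = bytearray(original)
changed[10] = 255                # a one-byte edit
edited = bytes(changed)

large = unique_bytes_stored([original, edited], segment_size=16)
small = unique_bytes_stored([original, edited], segment_size=4)
print(large, small)  # prints "80 68"
```

With 16-byte segments the one-byte edit forces a full 16-byte segment to be stored again; with 4-byte segments only 4 extra bytes are stored. The trade-off, which this sketch does not model, is that smaller segments produce more hashes to track.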

In a PureDisk environment, a single server can manage 4TB of PureDisk storage and up to 100 million files, which, according to Symantec, equates to a little more than 80TB of source data. Additional servers can be added to expand PureDisk's storage capacity or to handle a larger number of files.
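A quick back-of-the-envelope calculation shows what those Symantec figures imply. The sketch assumes decimal units (1TB = 10^12 bytes) and rounds "a little more than 80TB" down to 80TB, so the results are approximations rather than vendor-stated numbers.

```python
TB = 10**12  # decimal terabyte, an assumption

stored_bytes = 4 * TB        # PureDisk storage managed by one server
source_bytes = 80 * TB       # source data, rounded down from "a little more than 80TB"
file_count = 100_000_000     # files managed by one server

# Implied reduction ratio of source data to stored data.
reduction_ratio = source_bytes / stored_bytes

# Implied average source file size, in kilobytes (1KB = 1,000 bytes).
average_file_kb = source_bytes / file_count / 1000

print(reduction_ratio, average_file_kb)  # prints "20.0 800.0"
```

In other words, the figures imply roughly a 20:1 reduction and an average source file of about 800KB.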

PureDisk manages file metadata outside of the file system using a MetaBase Server and MetaBase Engines. As an environment grows, a storage manager uses PureDisk to add new instances of MetaBase Engines; because the MetaBase Server controls communication with all MetaBase Engines, expanding the deduplication environment is a relatively simple process. Separating the file metadata from the file system allows PureDisk to improve search- and maintenance-related activities on the underlying storage system, grow to hundreds of terabytes and billions of files, and retain a single logical instance of deduplicated data across the enterprise.


Click here for a chart showing deduplicating disk libraries (PDF).


Early adopters
Early adopters of EMC Avamar and Symantec Veritas NetBackup PureDisk report few problems with installing backup software agents and few server performance hits, but they do monitor two circumstances more carefully: the initial round of backups and the age of the server on which agents are deployed.
