This article can also be found in the Premium Editorial Download "Storage magazine: Surprise winner: BlueArc earns top NAS quality award honors."
EMC Avamar and Symantec Veritas NetBackup PureDisk take a slightly different approach to the performance issue. Both use agents that draw on the computing resources of each client server to do the initial file hash. As part of this process, the agents communicate with the main backup server, which maintains a central database of unique file hashes. As the Avamar or PureDisk agents hash files, they check with the central server to see whether each generated hash already exists. If the hash exists, the agent ignores the file; if it doesn't, the agent breaks the file into smaller segments and looks for new, unique segments to store. From that point, EMC Avamar and PureDisk diverge in their implementation.
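The agent-side check described above can be sketched in a few lines of Python. This is a minimal illustration, not either vendor's implementation: the `CentralIndex` class, the `backup_file` function, the use of SHA-256, and the tiny fixed segment size are all assumptions made for the example, since the article does not specify the hash algorithm or how the products segment files.

```python
import hashlib

SEGMENT_SIZE = 4  # bytes; unrealistically small, chosen only for illustration


class CentralIndex:
    """Hypothetical stand-in for the backup server's database of known hashes."""

    def __init__(self):
        self.file_hashes = set()
        self.segments = {}  # segment hash -> segment bytes

    def has_file(self, digest):
        return digest in self.file_hashes

    def has_segment(self, digest):
        return digest in self.segments

    def store_segment(self, digest, data):
        self.segments[digest] = data

    def register_file(self, digest):
        self.file_hashes.add(digest)


def sha(data):
    # SHA-256 is an assumption; the article does not name the hash used.
    return hashlib.sha256(data).hexdigest()


def backup_file(data, index, segment_size=SEGMENT_SIZE):
    """Back up one file's bytes; return how many new segments were stored."""
    file_digest = sha(data)
    if index.has_file(file_digest):
        return 0  # whole file already known: the agent ignores it

    stored = 0
    for offset in range(0, len(data), segment_size):
        segment = data[offset:offset + segment_size]
        seg_digest = sha(segment)
        if not index.has_segment(seg_digest):
            index.store_segment(seg_digest, segment)
            stored += 1

    index.register_file(file_digest)
    return stored


index = CentralIndex()
first = backup_file(b"AAAABBBBCCCC", index)    # all three segments are new
repeat = backup_file(b"AAAABBBBCCCC", index)   # file hash already known
changed = backup_file(b"AAAABBBBDDDD", index)  # only the last segment is new
```

In this sketch the second backup of an identical file costs one hash lookup and stores nothing, while the modified file stores only its one changed segment.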
EMC Avamar allows a server's storage capacity to grow to approximately 1.5TB. Symantec Veritas NetBackup PureDisk servers can grow to manage nearly 4TB of PureDisk storage capacity, but EMC Avamar uses segment sizes that are about one-fourth the size of PureDisk's. This allows it to better identify redundant data in files, asserts Jed Yueh, EMC Avamar's VP of product management. For users who need to grow in capacity and scale, EMC Avamar uses a redundant array of independent nodes (RAIN) clustering architecture. Organizations can add server nodes to the RAIN cluster to increase capacity and performance by striping data across multiple nodes.
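The segment-size claim can be illustrated with a toy comparison. The sketch below assumes fixed-size segments and SHA-256 hashing, neither of which the article confirms for either product, and the `new_bytes_stored` helper and the byte sizes are hypothetical. It changes a single byte in a file and counts how much data would have to be stored again at two segment sizes, one a quarter of the other.

```python
import hashlib


def new_bytes_stored(old, new, segment_size):
    """Bytes of `new` whose segments are not already known from `old`.

    Assumes fixed-size segments; duplicates inside `new` itself are not
    collapsed, which keeps the sketch short.
    """
    def segments(data):
        return [data[i:i + segment_size]
                for i in range(0, len(data), segment_size)]

    known = {hashlib.sha256(s).digest() for s in segments(old)}
    return sum(len(s) for s in segments(new)
               if hashlib.sha256(s).digest() not in known)


old = bytes(range(64))   # 64 distinct bytes, already backed up
modified = bytearray(old)
modified[10] = 255       # change a single byte
new = bytes(modified)

large = new_bytes_stored(old, new, segment_size=16)
small = new_bytes_stored(old, new, segment_size=4)
```

Here the one-byte change forces a full 16-byte segment to be stored at the larger size but only a 4-byte segment at the smaller one. Smaller segments also mean more hashes to compute and look up, a trade-off this sketch does not measure.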
In a PureDisk environment, a single server
PureDisk manages file metadata outside of the file system using a MetaBase Server and MetaBase Engines. As an environment grows, a storage manager uses PureDisk to add new instances of MetaBase Engines; because the MetaBase Server controls communication to all MetaBase Engines, expanding the deduplication environment is a relatively simple process. Separating the file metadata from the file system allows PureDisk to improve search- and maintenance-related activities on the underlying storage system, grow to hundreds of terabytes and billions of files, and retain a single logical instance of deduplicated data across the enterprise.
Early adopters of EMC Avamar and Symantec Veritas NetBackup PureDisk report few problems installing the backup software agents and few server performance hits, but there are two circumstances they monitor more carefully: the initial round of backups and the age of the server on which agents are deployed.
This was first published in June 2007