Hardware-based deduplication is less disruptive: it is compatible with any backup software and can be deployed quickly and easily. It typically relies on powerful, purpose-built storage appliances that can process the entire (non-deduplicated) backup load, either before or after the data is ingested. Hardware-based solutions also have the advantage of processing data streams from multiple backup applications.
There are a few trade-offs to consider. Because deduplication happens at the end of the data path, more data than necessary may traverse the network between the source system and the target device, creating avoidable congestion. Depending on the solution, scalability could be another drawback. Some vendors are limited to single-node systems, which can result in multiple islands of deduplication and points of management, as well as underutilized capacity in each silo. Data streamed to a single-node system is compared only with other data directed to that same node.
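The "islands of deduplication" effect can be illustrated with a small sketch. It is not any vendor's implementation: the `DedupeNode` class, the fixed 4-byte chunk size and the sample streams are all hypothetical, chosen only to keep the arithmetic visible. Real appliances generally use far larger chunks, often of variable length.

```python
import hashlib

CHUNK_SIZE = 4  # assumption: tiny fixed-size chunks, for illustration only


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Cut a backup stream into fixed-size chunks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class DedupeNode:
    """Hypothetical target-side appliance with its own private chunk index."""

    def __init__(self) -> None:
        self.index: set[str] = set()  # fingerprints of chunks already stored
        self.bytes_received = 0       # everything that crossed the network
        self.bytes_stored = 0         # what remained after deduplication

    def ingest(self, stream: bytes) -> None:
        # The full stream arrives before any duplicate is detected.
        self.bytes_received += len(stream)
        for chunk in split_chunks(stream):
            fingerprint = hashlib.sha256(chunk).hexdigest()
            if fingerprint not in self.index:
                self.index.add(fingerprint)
                self.bytes_stored += len(chunk)


# Two backup streams that share three of their four chunks.
stream_a = b"AAAABBBBCCCCDDDD"
stream_b = b"AAAABBBBCCCCEEEE"

# Case 1: both streams land on the same node and share one index.
shared = DedupeNode()
shared.ingest(stream_a)
shared.ingest(stream_b)

# Case 2: each stream lands on a separate single-node system.
silo_1, silo_2 = DedupeNode(), DedupeNode()
silo_1.ingest(stream_a)
silo_2.ingest(stream_b)

print("one index  :", shared.bytes_stored, "bytes stored")
print("two silos  :", silo_1.bytes_stored + silo_2.bytes_stored, "bytes stored")
print("over network:", shared.bytes_received, "bytes in either case")
```

In this toy case the shared index stores 20 bytes, while the two silos store 32 between them because neither can see the other's chunks. The full 32 bytes cross the network in both cases, which is the congestion trade-off described above.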
The goal of many target-side deduplication vendors is to deduplicate across clustered nodes. Global deduplication allows backup data to be deduplicated against all other backup data, regardless of which head actually receives the data. This capability is seen more often in software-based and grid architecture approaches, but may also be supported for target deduplication systems that replicate in a hub-and-spoke fashion (with
Cost is a factor
Enterprise Strategy Group research has found that organizations are just as likely to purchase and implement data deduplication technology from backup software vendors as they are from disk/appliance hardware vendors. The top considerations when evaluating and selecting a data deduplication provider are cost, ease of integration, performance, ease of use and scalability, with cost clearly outranking the others. Now that deduplication is becoming a mainstream feature integrated into backup software, it will be interesting to see whether "bolt-on" deduplication systems can maintain their premium price.
As with any new technology, IT organizations will need to evaluate software- and hardware-based approaches against the requirements of their environment. A clear understanding of how deduplication works, especially in conjunction with other requirements such as performance, ease of use and offsite copy creation, should go a long way toward selecting and designing a solution that delivers maximum business, operational and financial benefits.
BIO: Lauren Whitehouse is an analyst focusing on backup and recovery software and replication solutions at Enterprise Strategy Group, Milford, Mass.
This was first published in March 2009