Where does deduplication belong in backup? (Hot Spots)

Hardware-based deduplication

Hardware-based deduplication is less disruptive to deploy: it's compatible with any backup software and can be implemented quickly and easily. It typically relies on powerful, purpose-built storage appliances that process the entire (non-deduplicated) backup load, either before the data is written to disk (pre-ingestion) or afterward (post-ingestion). Hardware-based solutions also have the advantage of processing data streams from multiple backup applications.
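To make the idea concrete, here is a minimal sketch of a target-side deduplication store that accepts full streams from two backup applications. It is an illustration under stated assumptions, not how any particular appliance works: the class name `DedupTarget`, the fixed 4 KB chunk size and the SHA-256 fingerprints are all hypothetical choices, and commercial products often use variable-size chunking instead.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, chosen only for illustration


class DedupTarget:
    """Toy target-side store: receives full streams, keeps one copy per unique chunk."""

    def __init__(self):
        self.chunks = {}         # fingerprint -> chunk bytes
        self.bytes_received = 0  # everything that arrived, duplicates included

    def ingest(self, stream: bytes) -> list:
        """Deduplicate one backup stream and return its list of fingerprints."""
        self.bytes_received += len(stream)
        recipe = []
        for i in range(0, len(stream), CHUNK_SIZE):
            chunk = stream[i:i + CHUNK_SIZE]
            fp = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(fp, chunk)  # store only if not seen before
            recipe.append(fp)
        return recipe

    def restore(self, recipe: list) -> bytes:
        """Rebuild a stream from its fingerprints."""
        return b"".join(self.chunks[fp] for fp in recipe)

    def bytes_stored(self) -> int:
        return sum(len(c) for c in self.chunks.values())


def demo_multi_app():
    """Two hypothetical backup applications send streams that share three chunks."""
    target = DedupTarget()
    shared = b"".join(bytes([k]) * CHUNK_SIZE for k in range(3))
    stream_app1 = shared + bytes([10]) * CHUNK_SIZE
    stream_app2 = shared + bytes([11]) * CHUNK_SIZE
    recipe1 = target.ingest(stream_app1)
    recipe2 = target.ingest(stream_app2)
    assert target.restore(recipe1) == stream_app1
    assert target.restore(recipe2) == stream_app2
    return target.bytes_received, target.bytes_stored()
```

In this toy run the target receives eight chunks but stores five, because the store fingerprints raw bytes and does not care which application produced them.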

There are a few trade-offs to consider. Because deduplication happens at the end of the data path, more data than necessary traverses the network between the source system and the target device, creating avoidable congestion. Depending on the solution, scalability could be another drawback. Some vendors are limited to single-node systems, which can result in multiple islands of deduplication and points of management, as well as underutilized capacity in each silo. Data streamed to a single-node system is compared only with other data directed to that node.
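The network trade-off can be sketched with a small calculation. The function below counts the bytes that cross the network when deduplication happens at the target (everything is sent) and, for contrast, when duplicate chunks are filtered out before transmission. The function name, the 4 KB chunk size and the two sample backups are assumptions made for illustration, and the sketch ignores the small cost of exchanging fingerprints.

```python
import hashlib

CHUNK = 4096  # assumed fixed chunk size, for illustration only


def bytes_over_network(backups, dedupe_before_sending):
    """Count bytes sent for a sequence of backups of the same source system."""
    known = set()  # fingerprints the target already holds
    sent = 0
    for data in backups:
        for i in range(0, len(data), CHUNK):
            chunk = data[i:i + CHUNK]
            fp = hashlib.sha256(chunk).hexdigest()
            if dedupe_before_sending and fp in known:
                continue  # duplicate chunk never leaves the source
            sent += len(chunk)
            known.add(fp)
    return sent


# Hypothetical workload: a ten-chunk backup, then the same data with one chunk changed.
monday = b"".join(bytes([k]) * CHUNK for k in range(10))
tuesday = monday[:-CHUNK] + bytes([99]) * CHUNK

target_side = bytes_over_network([monday, tuesday], dedupe_before_sending=False)
filtered = bytes_over_network([monday, tuesday], dedupe_before_sending=True)
```

With this workload, target-side deduplication moves all 20 chunks across the network, while filtering before transmission moves 11. The stored result is the same in both cases; only the traffic differs.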

The goal of many target-side deduplication vendors is to deduplicate across clustered nodes. Global dedupe allows backup data to be deduplicated against all other backup data, regardless of which head actually receives the data. This capability is seen more often in software-based and grid architecture approaches, but may also be supported for target deduplication systems that replicate in a hub-and-spoke fashion (with global deduplication occurring at the hub). Global deduplication can result in higher deduplication ratios -- as data is deduplicated within and across backup sources -- and greater economies of scale with respect to operational overhead and capital costs.
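The effect of a global index on the deduplication ratio can be shown with a rough sketch. It compares the same backup streams stored on isolated single-node systems, each with its own index, against a single shared index. The names `unique_bytes` and `dedup_ratio`, the 4 KB chunk size and the sample data are hypothetical, and real ratios depend heavily on the workload.

```python
import hashlib

CHUNK = 4096  # assumed fixed chunk size, for illustration only


def unique_bytes(streams):
    """Bytes kept after deduplicating a group of streams against one index."""
    seen = {}
    for s in streams:
        for i in range(0, len(s), CHUNK):
            c = s[i:i + CHUNK]
            seen.setdefault(hashlib.sha256(c).digest(), len(c))
    return sum(seen.values())


def dedup_ratio(node_streams, global_index):
    """node_streams maps a node name to the list of streams directed to it."""
    all_streams = [s for streams in node_streams.values() for s in streams]
    logical = sum(len(s) for s in all_streams)
    if global_index:
        stored = unique_bytes(all_streams)  # one index spans every node
    else:
        stored = sum(unique_bytes(v) for v in node_streams.values())  # one index per silo
    return logical / stored


# Hypothetical data: both nodes receive the same eight-chunk image plus two unique chunks.
image = b"".join(bytes([k]) * CHUNK for k in range(8))
node_streams = {
    "node_a": [image + bytes([100]) * CHUNK + bytes([101]) * CHUNK],
    "node_b": [image + bytes([200]) * CHUNK + bytes([201]) * CHUNK],
}

siloed = dedup_ratio(node_streams, global_index=False)
global_ratio = dedup_ratio(node_streams, global_index=True)
```

In this example the siloed nodes achieve no reduction at all, because the shared image is stored once per node, while the global index stores it once in total and reaches a ratio of about 1.67:1.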

Cost is a factor

Enterprise Strategy Group research has found that organizations are just as likely to purchase and implement data deduplication technology from backup software vendors as they are from disk/appliance hardware vendors. The top considerations when evaluating and selecting a data deduplication provider are cost, ease of integration, performance, ease of use and scalability, with cost clearly outranking the others. Now that deduplication is becoming a mainstream feature integrated into backup software, it will be interesting to see whether "bolt on" deduplication systems can maintain their premium price.

As with any new technology, it will be important for IT organizations to evaluate software- and hardware-based approaches against the requirements of their environment. Having a clear understanding of how deduplication works, especially in conjunction with other requirements such as performance, ease of use and offsite copy creation, should go a long way toward selecting and designing a solution that delivers maximum business, operational and financial benefits.

BIO: Lauren Whitehouse is an analyst focusing on backup and recovery software and replication solutions at Enterprise Strategy Group, Milford, Mass.

This was first published in March 2009
