This article can also be found in the Premium Editorial Download "Storage magazine: Using two midrange backup apps at once."
While virtual tape library (VTL) vendors have been scrambling to add data deduplication to their products in recent months, the technology is spreading to archiving, replication and even primary storage.
A few examples: NetApp claims thousands of its customers have licensed its dedupe for primary storage, Data Domain is moving to bring dedupe to secondary and archiving applications, and switch vendor Brocade plans to get into the dedupe act with a fabric-based replication device.
NetApp has gone against the grain with dedupe. It's the first storage vendor to offer deduplication for primary data and one of the last to put it in its VTL. According to Chris Cummings, NetApp's senior director of data protection solutions, 2,500 customers had activated NetApp's free dedupe utility in its Data Ontap OS on 10,000 systems as of the end of July, and most are using it for primary storage.
Greg Stazyk, systems coordinator at the Michael Smith Genome Sciences Centre (GSC) in Vancouver, BC, says he's been using NetApp's dedupe for primary data on his NetApp FAS6070 system since last year. He says his biggest fear proved to be unfounded, and he's happy with the results. "I haven't seen performance issues, which was one of the concerns I had," he says.
Stazyk gets the most benefit from deduping data sets such as home directories and virtual machine volumes. "We use deduplication on a couple of different types of data sets. Some of the data sets are very high turnover, and we don't get much benefit," says Stazyk. "But we do have data sets that are static and we get good space savings there. On average, we get about 17% to 20% disk savings when we've applied it to static data sets."
Others question the value of deduping primary data. Data Domain CEO and president Frank Slootman argues that primary data doesn't live long enough or take up enough space to make it worth deduping. But part of the argument is over how to define primary data. Slootman says some of his customers dedupe home directories, one of the use cases cited by GSC's Stazyk.
This was first published in September 2008