| DEDUPLICATION IS ALL the rage these days, while the good old-fashioned Lempel-Ziv (LZ) compression technology we all grew up with is taken for granted. The real difference between dedupe and compression lies in their algorithms, that is, in how each one does its job. But few of us need to know those algorithmic details to do our jobs.
"I think there's a lot of confusion in the marketplace," says Arun Taneja, founder and consulting analyst at Taneja Group in Hopkinton, MA. "Most people aren't computer scientists."
Both technologies shrink data volumes. Compression does it by squeezing out repetitive bit patterns in a data stream, eliminating redundancy within a file. Dedupe compares objects at the file and sub-file level, stores only one instance of each, and replaces duplicates with references to the original.
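The distinction can be sketched in a few lines of Python. This is an illustration only: the file names and contents are invented, and real dedupe products typically fingerprint sub-file blocks rather than whole files.

```python
import hashlib
import zlib

# Hypothetical sample "files" for illustration. Note the exact
# duplicate file and the repetition *inside* each file.
files = {
    "report_v1.txt": b"quarterly numbers " * 50,
    "report_copy.txt": b"quarterly numbers " * 50,  # duplicate of v1
    "notes.txt": b"meeting notes " * 50,
}

# Compression: squeeze out repetitive patterns within each file,
# independently of any other file.
compressed = {name: zlib.compress(data) for name, data in files.items()}

# Deduplication: fingerprint each file and keep one stored instance
# per unique fingerprint; duplicates become references to it.
store = {}       # fingerprint -> the single stored instance
references = {}  # file name -> fingerprint (the "pointer")
for name, data in files.items():
    fp = hashlib.sha256(data).hexdigest()
    store.setdefault(fp, data)
    references[name] = fp

original_size = sum(len(d) for d in files.values())
compressed_size = sum(len(d) for d in compressed.values())
deduped_size = sum(len(d) for d in store.values())
print(original_size, compressed_size, deduped_size)
```

Compression shrinks all three files but still stores the duplicate twice; dedupe stores the duplicate only once but leaves each file's internal repetition intact. Production systems often layer the two.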
Standard compression technology provides about a 2:1 ratio, halving the number of bits in a stream. Both tape and disk can accommodate compression. Dedupe, which is strictly for disks (tape's serial nature can't accommodate it), boasts ratios on the order of 30:1. "But it's not like one is a panacea and one is a dud," notes Taneja.
Now for the twist: Storwize and Ocarina Networks, two competing compression vendors, have rejiggered the LZ algorithm so that compression technology can be used in "those scenarios where performance matters," says Taneja, not just for backup data. Storwize calls itself a capacity optimization provider for primary storage. That means Storwize offers inline compression (that's radical) and promises no performance drawbacks.
Amit Bar-On, IT manager at Polycom Inc., a Pleasanton, CA-based video conferencing firm with 3,000 users, connected Storwize to his NetApp system. Right away, his 300GB of available storage effectively grew to 1.1TB. He says Polycom's data load was reduced by 65% across multiple apps.
Dedupe wasn't an option he needed, says Bar-On, although NetApp offers the capability for free in its system. "Dedupe is searching for files, and if you're doing that in the middle of the day, it can influence performance," he says. "I needed to compress all the data, not search the duplicate files, and I wanted something to run in real-time."