This article can also be found in the Premium Editorial Download "Storage magazine: Server virtualization strategies for storage managers."
Compression shares many of the challenges of deduplication in primary storage. Like deduplication, compression carries a performance overhead and is limited to a volume: whenever data is moved out of that volume it has to be decompressed, just as deduplicated data has to be rehydrated when moved from one volume to another. In an ideal world, different tiers, including backup and archival tiers, would be able to accept and handle compressed and deduplicated data, but because of a lack of standards, they usually don’t.
Compression and deduplication are complementary technologies, and vendors that implement deduplication usually also offer compression -- BridgeSTOR, Dell, NetApp and Sun all do. While deduplication is usually more efficient for virtual server volumes, email attachments, files and backup environments, compression yields better results with data that contains few duplicate blocks, such as databases. In other words, deduplication outperforms compression where the likelihood of repetitive data is high.
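The contrast can be illustrated with a small Python sketch. It is not any vendor's implementation: the fixed 4 KB block size, the function names `dedup_ratio` and `compression_ratio`, and both sample data sets are assumptions chosen only to show the effect. It compares fixed-block deduplication (counting unique blocks by hash) with general-purpose zlib compression on highly repetitive data and on data with no repeated blocks.

```python
import hashlib
import os
import zlib

BLOCK_SIZE = 4096  # assumed fixed block size, for illustration only


def dedup_ratio(data: bytes, block_size: int = BLOCK_SIZE) -> float:
    """Logical blocks divided by unique blocks (fixed-size block dedup)."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    unique = {hashlib.sha256(block).digest() for block in blocks}
    return len(blocks) / len(unique)


def compression_ratio(data: bytes) -> float:
    """Original size divided by zlib-compressed size."""
    return len(data) / len(zlib.compress(data))


# Hypothetical "cloned image" data: ten identical copies of a 64 KB chunk
# whose contents are themselves incompressible.
image = os.urandom(64 * 1024)
clones = image * 10

# Hypothetical database-like data: every row is different, so no two blocks
# match, but each block is internally redundant.
records = b"".join(
    b"id=%08d;status=active;region=emea\n" % i for i in range(20000)
)

for name, data in (("clones", clones), ("records", records)):
    print(f"{name}: dedup {dedup_ratio(data):.1f}:1, "
          f"compression {compression_ratio(data):.1f}:1")
```

In this sketch the cloned data deduplicates 10:1 while zlib barely shrinks it, because each copy sits farther back than the compressor's 32 KB window. The record data shows the reverse: no block repeats, so deduplication gains nothing, yet the redundancy inside each block compresses well.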
In addition to the above vendors, EMC Corp. offers compression in its VNX Unified storage products. Its single-instance storage feature for file-based content, which stores a single copy of identical files, also provides some level of deduplication. IBM offers its Real-time Compression Appliances (STN6500 and STN6800) to front-end NAS storage; the appliances and the compression technology came to IBM via
“The Storwize real-time compression software will be a software feature on some IBM arrays later this year, and it will be available across all lines within 18 months,” said Ed Walsh, director of IBM’s storage efficiency strategy.
A blend of new and old techs
Data reduction on primary storage is a reality today, and with the unchecked growth of data it will undoubtedly become a key part of storage efficiency. Storage efficiency features like RAID 6, thin provisioning, efficient clones and automated storage tiering are becoming must-haves and should be on anyone’s feature list when evaluating a primary storage system. Data deduplication and compression, on the other hand, are emerging technologies that will become more pervasive over time, but right now these relative newcomers are just beginning to have an effect on primary storage.
BIO: Jacob Gsoedl is a freelance writer and a corporate director for business systems. He can be reached at firstname.lastname@example.org.
This was first published in June 2011