While primary deduplication is not yet as common as deduplication for backup, it's rapidly becoming a feature built into storage arrays -- and one that customers are looking for.
The focus on deduplication is shifting from backup appliances to primary storage arrays, driven by the rise of virtual machines (VMs), virtual desktops and solid-state storage. Porting data reduction technology to primary storage proved difficult. NetApp Inc. has had primary dedupe since 2007 and Nimble Storage Inc. has used compression since the company came out of stealth in 2010. In the past year, vendors such as Dell Inc., EMC Corp. and Hitachi Data Systems have added deduplication to one or more of their storage platforms.
"It's becoming a competitive feature, and it's not an easy feature to put in," said Leah Schoeb, senior partner at Boulder, Colo.-based analyst firm Evaluator Group.
But it's a feature storage admins are increasingly relying on. Offering arrays with deduplication built in allows a customer to buy less capacity and curb overprovisioning.
"Customers are space-constrained, and [deduplication] translates into less space and less energy consumption," said David Noy, senior director of product management for EMC Isilon. EMC added its SmartDedupe primary deduplication feature to Isilon network-attached storage arrays in October 2013, and about one-third of the nodes or clusters that have been upgraded since then are using it, Noy said.
Because VMs have so much in common -- they often run the same operating system -- dedupe can greatly reduce their storage footprint by eliminating redundancy. Dedupe also improves VM performance by allowing more storage blocks to be cached in memory, since a single cached copy of a shared block can serve every VM that references it.
The same goes for a virtual desktop infrastructure, where as much as 90% of the information stored on the virtual desktops is duplicated data. Deduplication -- particularly when done inline -- can greatly reduce that footprint.
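To make the idea concrete, here is a minimal sketch of block-level deduplication in Python. It is an illustration only, not how any vendor named in this article implements the feature: the fixed 4 KB block size, the SHA-256 fingerprints, and the names `dedupe` and `restore` are all assumptions chosen for the example, and the three "VM images" are synthetic data.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size, for illustration only


def dedupe(images, block_size=BLOCK_SIZE):
    """Store each distinct block once and keep a per-image list of fingerprints."""
    store = {}    # fingerprint -> block bytes (the only physical copy)
    recipes = []  # one ordered fingerprint list per image
    for image in images:
        recipe = []
        for offset in range(0, len(image), block_size):
            block = image[offset:offset + block_size]
            fingerprint = hashlib.sha256(block).hexdigest()
            store.setdefault(fingerprint, block)  # keep only the first copy
            recipe.append(fingerprint)
        recipes.append(recipe)
    return store, recipes


def restore(store, recipe):
    """Rebuild an image from its fingerprint list."""
    return b"".join(store[fp] for fp in recipe)


# Synthetic data: three "VM images" that share eight identical OS blocks
# and each carry one block of unique user data.
os_blocks = b"".join(bytes([i]) * BLOCK_SIZE for i in range(8))
images = [os_blocks + bytes([100 + n]) * BLOCK_SIZE for n in range(3)]

store, recipes = dedupe(images)
logical = sum(len(img) for img in images)
physical = sum(len(block) for block in store.values())
print(f"{logical // BLOCK_SIZE} logical blocks -> {len(store)} stored blocks")
```

In this toy case the shared operating system blocks are stored once, so 27 logical blocks shrink to 11 stored blocks. Real arrays differ in block size, fingerprinting and whether the work happens inline or after the write.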
Solid-state at the forefront
The increased use of expensive solid-state drives (SSDs) also makes dedupe more beneficial. From a cost-per-gigabyte perspective, SSD's lower capacity and higher price put it at the high end of the spectrum. Dedupe helps drive the cost per gigabyte down, while freeing up capacity on high-performing drives.
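The arithmetic behind that claim can be sketched in a few lines of Python. The function names and the figures used here ($10 per raw gigabyte, a 5:1 reduction ratio, 1,000 GB of raw flash) are hypothetical values picked for illustration, not numbers from any vendor or analyst quoted in this article.

```python
def effective_cost_per_gb(raw_cost_per_gb, reduction_ratio):
    """Cost per gigabyte of data actually stored after data reduction."""
    if reduction_ratio < 1:
        raise ValueError("reduction_ratio must be at least 1 (1 means no savings)")
    return raw_cost_per_gb / reduction_ratio


def effective_capacity_gb(raw_capacity_gb, reduction_ratio):
    """Logical data that fits on the raw capacity at a given reduction ratio."""
    if reduction_ratio < 1:
        raise ValueError("reduction_ratio must be at least 1 (1 means no savings)")
    return raw_capacity_gb * reduction_ratio


# Hypothetical inputs, for illustration only.
cost = effective_cost_per_gb(10.0, 5.0)        # dollars per effective GB
capacity = effective_capacity_gb(1000.0, 5.0)  # effective GB on 1,000 raw GB
print(f"${cost:.2f} per effective GB, {capacity:.0f} GB effective capacity")
```

The ratio an array actually achieves depends heavily on the workload, so any such estimate is only as good as the assumed reduction ratio.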
"The initial driver was to be able to not only extend the life of NAND flash technology so you can have the product longer and get longer warranties, but because for so long NAND flash had been more expensive -- they wanted to make sure they were getting the most use out of the capacity," Evaluator Group's Schoeb said.
According to Marc Staimer, president of Beaverton, Ore.-based Dragon Slayer Consulting, dedupe will become the norm among arrays containing SSDs.
"Almost every startup that's doing hybrid or full SSD is doing dedupe," Staimer said. EMC's XtremIO all-flash array includes inline deduplication, and startups SolidFire, Pure Storage and Nimbus Data have offered deduplication and compression in their arrays from the start.
"I do think [deduplication] is a trend, and that it's here to stay. And I think vendors that don't offer it yet are trying to figure it out," Evaluator Group's Schoeb said.