Hitachi Data Systems is offering primary deduplication in its Hitachi NAS Platform, a feature that was more than two years in the making.
That's about par for the course for primary dedupe. There was a lot of noise around the technology in 2010, followed by a lot of silence.
Hitachi NAS (HNAS) is the network-attached storage (NAS) platform Hitachi Data Systems (HDS) gained when it acquired BlueArc in 2011. BlueArc first revealed a license deal with primary dedupe software vendor Permabit in August 2010. HDS at that time sold BlueArc systems under an OEM deal, but BlueArc remained an independent company.
BlueArc still hadn't implemented dedupe at the time of the HDS acquisition, and it took HDS engineers more than a year to make it ready for commercial use. HDS made sure the primary dedupe didn't impact performance by offloading much of the processing to hardware.
Michael Hay, vice president of product planning for HDS, said Hitachi NAS uses Permabit's hash database, but "all the other heavy lifting is done by Hitachi." That includes using hardware acceleration and the ability to read deduped data without having to rehydrate it.
HNAS uses an object-based file system offload engine powered by field-programmable gate arrays (FPGAs) to accelerate the hashing and chunking involved with primary dedupe. A basic hashing/chunking engine is free with HNAS, and additional engines for parallel hashing/chunking are available for an additional license.
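For readers unfamiliar with the terms, the chunking and hashing step can be illustrated with a minimal software sketch. This is not HDS or Permabit code. The `DedupeStore` class, the fixed 4 KB chunk size and the choice of SHA-256 are all assumptions made for illustration, and HNAS performs this work in FPGA hardware rather than in software like this.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size in bytes (an assumption)


class DedupeStore:
    """Toy chunk store: each unique chunk is kept once, keyed by its hash."""

    def __init__(self, chunk_size=CHUNK_SIZE):
        self.chunk_size = chunk_size
        self.chunks = {}  # hash digest -> chunk bytes, stored once
        self.files = {}   # file name -> ordered list of chunk digests

    def write(self, name, data):
        digests = []
        # Chunking: split the data into fixed-size pieces.
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            # Hashing: identical chunks produce identical digests.
            digest = hashlib.sha256(chunk).hexdigest()
            # Store the chunk only if this digest has not been seen before.
            self.chunks.setdefault(digest, chunk)
            digests.append(digest)
        self.files[name] = digests

    def read(self, name):
        # Follow the file's digest list back to the shared chunks.
        return b"".join(self.chunks[d] for d in self.files[name])

    def stored_bytes(self):
        # Physical capacity consumed by unique chunks only.
        return sum(len(c) for c in self.chunks.values())
```

In this sketch, writing two files that share a chunk consumes space for that chunk only once, and a read simply follows the file's list of hashes back to the stored chunks.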
"We have a unique architecture with HNAS," Hay said. "Our file system is overlaid on two types of processors, and that gives us the ability to dedupe without compromise."
State of primary dedupe
The summer of 2010 was a busy time for primary dedupe news. Dell acquired primary dedupe vendor Ocarina Networks and IBM bought primary compression startup Storwize two weeks apart in July 2010. Permabit revealed its deal with BlueArc a few weeks later. It also had an OEM deal with LSI for its Engenio storage, but that dissolved when NetApp acquired the Engenio product line in 2011.
IBM sells the old Storwize products as its Real-Time Compression appliances, but Dell has yet to add Ocarina dedupe to its EqualLogic or Compellent storage arrays.
"Primary dedupe is in no man's land right now," according to Taneja Group consulting analyst Arun Taneja. "I don't think it's made big strides yet. I think eventually it will happen, but most vendors haven't found a way to do it without affecting performance."
Permabit is pushing on, looking for OEM deals in other places. The vendor is talking to SMB storage vendors about using its software for Linux-based NAS systems, and Vice President of Marketing Mike Ivanov said he expects one announcement around mid-2013. He said several flash array vendors are also evaluating Permabit's dedupe software in hopes of bringing down the per-gigabyte cost of flash by optimizing capacity. However, there is no guarantee that any of those who use the software will identify it as Permabit's.
"A lot of people we're talking to have attempted to do their own dedupe and failed," Ivanov said. "They will never let the world know we're the OEM."
Primary dedupe is showing up in non-traditional storage systems. Startup GreenBytes uses what it calls "zero latency" inline dedupe in its IO Offload Engine built to improve performance of virtual desktop infrastructure storage. GreenBytes CEO Steve O'Donnell said his company actually began with designs on dedupe for backup. The dedupe algorithm it developed worked with little latency, which doesn't help much for backup, but is valuable for primary data.
"For backup, zero latency is irrelevant. Nobody cares if it's 20 minutes of latency," O'Donnell said. "But for primary storage, our dedupe technology makes a lot of sense. It's a unique differentiator and perfect for highly virtualized workloads."