NetApp Inc. has added support for its free data deduplication utility to its V-Series gateways, which are diskless filer heads that can make another vendor's disk array look like a NetApp box. Some of the third-party disk array vendors the V-Series can front are EMC, Hitachi Data Systems, IBM, Hewlett-Packard, 3PAR and Fujitsu.
Fronting a Tier 1 disk array with a data deduplication box may not seem like a mainstream use case, but about 60% of the 2,500 NetApp customers who have turned on data dedupe use it for primary storage, said Chris Cummings, NetApp senior director of data protection solutions. While there might be a performance hit for data deduplication, NetApp uses a post-process approach that consolidates data processing into off hours.
"This won't be a flagship piece of our portfolio -- just another option for customers," Cummings admitted.
NetApp still has several key items on its data deduplication roadmap, including support for its virtual tape libraries (VTL), deduping across FlexVols to improve efficiency in larger systems, a GUI and data dedupe monitoring, Cummings said.
Jim Krochmal, manager of IT for Polysius, a designer and installer of cement plants and equipment, said NetApp's data deduplication is currently running on his company's clustered FAS3020 filer. The company dedupes CIFS volumes, including Office documents, CAD designs and user home directories.
Krochmal hasn't seen performance penalties for using data dedupe with production data, and deduplication across FlexVols hasn't been an issue for him. "For our configuration, every volume has 1 TB or less," he said. "We're small enough where that doesn't hit us."
However, Krochmal is hoping to see a GUI, along with monitoring of data deduplication performance and capacity, sometime soon. "Everything uses a text console right now," he said.
Cummings responded that, until NetApp provides a GUI and dedupe monitoring, the system handles data deduplication performance throttling automatically.
Gartner analyst David Russell said NetApp may have placed integration with V-Series ahead of its other stated goals for data deduplication in order to boost sales of a product that's lagging behind.
"Among storage virtualization products, our research data shows about 12,000 IBM SVC, and about 9,300 HDS USP and NSC55 deployments to date, compared with about 2,000 for V-Series," he said. "This integration is probably meant to push V-Series and is also part of NetApp's campaign to go from a NAS to an infrastructure supplier."
NetApp is "a $4 billion company that a lot of very senior IT leaders have never heard of," Balaouras said. "They need to explain how they solve broad IT challenges and be seen as a strategic IT partner, not just a storage vendor or supplier."