This article can also be found in the Premium Editorial Download "Storage magazine: Slimmer storage: How data reduction systems work."
DRIPS and SSD
One effect of space-reduction techniques is an increase in I/O density, and specifically in the random I/O they create. Thin provisioning raises I/O density because unused space is eliminated, so the same workload is concentrated on less physical capacity. Deduplication creates more random I/O because the locations of duplicate and single-instance blocks are unpredictable, and the access pattern becomes more random over time.
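To make the random I/O effect concrete, here is a minimal sketch of fixed-size block deduplication. It is not any vendor's implementation: the 4 KB block size, the SHA-256 fingerprint, the in-memory `store` dictionary and the helper names `dedupe_layout` and `block` are all assumptions chosen for illustration. It shows how a file that is sequential from the host's point of view can end up backed by physical blocks in a scattered order once some of its blocks match data that was written earlier.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def dedupe_layout(data: bytes, store: dict) -> list:
    """Return the physical block number backing each logical block of data.

    `store` maps a block fingerprint to the physical block that already
    holds that content; unseen content goes to the next free block.
    """
    layout = []
    for offset in range(0, len(data), BLOCK_SIZE):
        chunk = data[offset:offset + BLOCK_SIZE]
        fingerprint = hashlib.sha256(chunk).digest()
        if fingerprint not in store:
            store[fingerprint] = len(store)  # allocate a new physical block
        layout.append(store[fingerprint])
    return layout


def block(tag: str) -> bytes:
    """Build one block filled with a single repeated character."""
    return tag.encode() * BLOCK_SIZE


store = {}
file_a = block("A") + block("B") + block("C") + block("D")
file_b = block("C") + block("E") + block("A") + block("F")

layout_a = dedupe_layout(file_a, store)
layout_b = dedupe_layout(file_b, store)
print(layout_a)  # [0, 1, 2, 3]  first file lands sequentially
print(layout_b)  # [2, 4, 0, 5]  second file is scattered by shared blocks
```

Reading the second file sequentially means visiting physical blocks 2, 4, 0 and 5 in that order. On spinning disk that pattern costs seeks; the more shared blocks accumulate, the less the physical layout resembles the logical one.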
Solid-state and dedupe: A good match
Data dedupe is enabling solid-state storage array vendors to compete at a $/GB ratio comparable to that of today’s high-end hard disk drive storage arrays. Once flash technology is trusted and widely adopted, data reduction techniques will increase the appeal of solid-state arrays and make today’s high-end disk-based arrays a much tougher sell for many firms.
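The $/GB argument comes down to simple arithmetic, sketched below. The raw price and the reduction ratios are hypothetical figures picked only to show the calculation, and `effective_cost_per_gb` is an illustrative helper, not a vendor formula.

```python
def effective_cost_per_gb(raw_cost_per_gb: float, reduction_ratio: float) -> float:
    """Cost per logical GB stored, given a data reduction ratio.

    A ratio of 5.0 means 5 GB of logical data fits in 1 GB of physical
    capacity; a ratio of 1.0 means no reduction at all.
    """
    if reduction_ratio <= 0:
        raise ValueError("reduction ratio must be positive")
    return raw_cost_per_gb / reduction_ratio


RAW_SSD_COST = 10.0  # hypothetical raw flash price in $/GB

for ratio in (1.0, 2.0, 5.0):
    cost = effective_cost_per_gb(RAW_SSD_COST, ratio)
    print(f"{ratio:.0f}:1 reduction -> ${cost:.2f} per logical GB")
```

The same array looks five times cheaper at 5:1 than at 1:1, which is why a quoted price only means something alongside the ratio actually achieved on the buyer's own data.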
Solid-state drives (SSDs) are a great fit for random I/O profiles, making them suitable for deployment in storage arrays implementing deduplication. There’s no latency penalty in handling random versus sequential I/O, and therefore no reduction in performance in managing deduplicated data. Dedupe also changes the effective $/GB ratio in terms of storage costs. Vendors are using deduplication ratios to reduce the $/GB cost of their storage and boost the appeal of their arrays to a wider audience. Prospective customers should be wary of accepting pricing based on deduplication ratios without an understanding of potential savings from their data, as savings may not
NetApp Inc. was the first vendor to implement dedupe of primary data in its arrays, starting way back in May 2007. The feature was originally known as A-SIS (Advanced Single-Instance Storage) and performed post-processing dedupe of data at the 4 KB block level. Initially, A-SIS was restricted by platform and to smaller volume sizes than the filers would support without A-SIS installed. This was to ensure performance remained consistent; it was well known that performance could degrade as A-SIS-enabled volumes reached capacity. These restrictions have been eased as more powerful hardware has become available. NetApp also supports thin provisioning, a feature that was significantly expanded with the introduction of aggregates to Data ONTAP 7.
In 2010, Dell Inc. acquired Ocarina Networks, which had developed a standalone deduplication appliance that could be placed in front of traditional storage to provide inline deduplication functionality. Since the acquisition, Dell has integrated the Ocarina technology into a number of product lines, including the Dell DR4000 for disk-to-disk backup and the Dell DX6000G Object Storage Platform, a storage compression node for object data. Dell has stated its intention to add primary data deduplication functionality to its EqualLogic and Compellent lines of storage arrays, which already support thin provisioning.
EMC Corp. has had data deduplication in its backup products for some time; in primary storage, however, only the VNX platform offers it, and there it’s limited to file-based deduplication in the part of VNX derived from the now-defunct Celerra hardware. Although EMC has discussed its intention to implement deduplication more broadly, no firm announcements or details have emerged.
Oracle Corp. has had the ability to use deduplication in its storage products since 2009, when it acquired the ZFS file system as part of the Sun Microsystems takeover. The Sun ZFS Storage Series 7000 appliances support inline deduplication, compression and thin provisioning. The ability to deduplicate using ZFS is also available to storage vendors using the technology within their storage products. This includes Nexenta Systems Inc., which released deduplication in NexentaStor 3.0 in 2010. GreenBytes Inc. is another startup using SSDs within its storage arrays in combination with ZFS to deliver deduplication functionality.
This was first published in October 2012