This article can also be found in the Premium Editorial Download "Storage magazine: Server virtualization strategies for storage managers."
Automated storage tiering
Automated storage tiering is another mechanism for reducing data on primary storage. An array’s ability to keep active data on fast, expensive storage and to move inactive data to slower, less-expensive tiers allows you to limit the amount of expensive tier-1 storage. The importance of automated storage tiering has increased with the adoption of solid-state storage in contemporary arrays and with the advent of cloud storage to supplement on-premises storage. By keeping data on the appropriate storage tier, automated tiering reduces the amount of premium storage needed, which can yield substantial cost savings and performance improvements.
There are a couple of key features to look for in automated storage tiering:
- The more granular the data that can be moved from one tier to another, the more efficiently expensive premium storage can be used. Sub-volume-level tiering, where blocks of data can be relocated rather than complete volumes, and byte-level rather than file-level tiering are preferable.
- The inner workings of the rules that govern data movement between tiers will determine the effort required to put automated tiering in place. Some systems, like EMC’s Fully Automated Storage Tiering (FAST), depend on policies that define when to move data and what tiers to move it to. Conversely, NetApp and Oracle (in the Sun ZFS Storage 7000 series) advocate that the storage system should be smart enough to automatically keep data on the appropriate tier without requiring user-defined policies.
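To make the policy-driven approach concrete, here is a minimal Python sketch of sub-volume tiering. It is not any vendor's implementation: the `Extent` type, the `accesses_last_day` activity metric, the tier names and the `retier` function are all hypothetical, chosen only to illustrate how a policy (a promotion threshold plus a capacity limit) could decide which extents occupy the fast tier.

```python
from dataclasses import dataclass


@dataclass
class Extent:
    """A sub-volume unit of data (hypothetical; real arrays use their own sizes)."""
    extent_id: int
    accesses_last_day: int  # assumed activity metric
    tier: str               # "ssd" (fast, expensive) or "sata" (slow, cheap)


def retier(extents, ssd_capacity, promote_threshold):
    """Apply a simple tiering policy and return the list of moves performed.

    Extents with at least `promote_threshold` accesses are candidates for SSD.
    The busiest candidates are placed on SSD, up to `ssd_capacity` extents;
    every other extent is placed on SATA.
    """
    candidates = sorted(
        (e for e in extents if e.accesses_last_day >= promote_threshold),
        key=lambda e: e.accesses_last_day,
        reverse=True,
    )
    ssd_ids = {e.extent_id for e in candidates[:ssd_capacity]}

    moves = []
    for e in extents:
        target = "ssd" if e.extent_id in ssd_ids else "sata"
        if e.tier != target:
            moves.append((e.extent_id, e.tier, target))
            e.tier = target
    return moves


if __name__ == "__main__":
    volume = [
        Extent(1, 500, "sata"),
        Extent(2, 3, "ssd"),
        Extent(3, 900, "ssd"),
        Extent(4, 200, "sata"),
    ]
    for extent_id, source, target in retier(volume, ssd_capacity=2, promote_threshold=100):
        print(f"extent {extent_id}: {source} -> {target}")
```

Because the unit of movement is an extent rather than a whole volume, only the busy portion of a volume consumes premium capacity, which is the granularity argument made in the first bullet. A system that tiers without user-defined policies would derive the threshold and placement itself instead of taking them as parameters.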
Well established in the backup and archival space, data deduplication is gradually finding its way into primary storage. The main challenge that has slowed adoption of deduplication in primary storage is performance. “Dedupe and performance simply don’t get along,” said Greg Schulz, founder and senior analyst at StorageIO Group, Stillwater, Minn. Nonetheless, deduplication has found its way into a few storage systems and it’s simply a matter of time before others will follow.
NetApp offers a deduplication option for all its systems, and it can be activated on a per-volume basis. NetApp’s deduplication isn’t executed in real time, though. Instead, it’s performed using a scheduled process, generally during off hours, that scans for duplicate 4 KB blocks and replaces them with a reference to the unique block. Instead of generating a unique hash for each 4 KB block, NetApp uses the block’s existing checksum to identify duplicate blocks. To guard against hash collisions, which happen if non-identical blocks share the same checksum (hash), NetApp does a block-level comparison of the data in the blocks and only deduplicates those that match. As far as performance is concerned, “we can deduplicate an average 1 TB of data per hour,” NetApp’s Freeman said. NetApp’s deduplication is currently performed within individual volumes or LUNs and doesn’t span across them.
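The checksum-then-verify approach described above can be sketched in a few lines of Python. This is an illustration of the general technique, not NetApp's code: the function name `dedupe_volume`, the in-memory data structures and the use of `zlib.adler32` as a stand-in for the block's existing checksum are all assumptions.

```python
import zlib

BLOCK_SIZE = 4096  # 4 KB blocks, as in the description above


def dedupe_volume(blocks, checksum=zlib.adler32):
    """Post-process deduplication within a single volume.

    `blocks` is a list of bytes objects, one per logical block.
    Returns (store, refs): `store` holds each unique block once, and
    `refs[i]` is the index in `store` that logical block i points to.
    """
    store = []        # unique block contents
    by_checksum = {}  # checksum -> indices of stored blocks with that checksum
    refs = []

    for block in blocks:
        key = checksum(block)
        match = None
        # A matching checksum only nominates a candidate; the full data
        # comparison guards against collisions between different blocks.
        for idx in by_checksum.get(key, []):
            if store[idx] == block:
                match = idx
                break
        if match is None:
            store.append(block)
            match = len(store) - 1
            by_checksum.setdefault(key, []).append(match)
        refs.append(match)
    return store, refs


if __name__ == "__main__":
    a = b"a" * BLOCK_SIZE
    b = b"b" * BLOCK_SIZE
    store, refs = dedupe_volume([a, b, a, a, b])
    print(f"{len(refs)} logical blocks stored as {len(store)} unique blocks")
```

Reusing a checksum avoids computing a separate hash for every block, and the data comparison is what keeps two different blocks with the same checksum from being merged. The sketch handles one volume at a time, mirroring the per-volume scope described above.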
Similar to NetApp, Oracle features block-level deduplication in its Sun ZFS Storage 7000 series systems. But unlike NetApp, dedupe is performed in real time while data is written to disk. “The overhead of deduplication is less than 7%, depending on the environment and amount of changes in the environment,” said Jason Schaffer, Oracle’s senior director of product management for storage. Among smaller players, BridgeSTOR LLC, with its application-optimized storage (AOS), supports deduplication.
Another vendor apparently committed to data reduction is Dell Inc. With the acquisition of Ocarina Networks in 2010, Dell picked up content-aware deduplication and compression technology, which it intends to incorporate into all its storage systems. “Starting the second half of this year, we’ll launch storage products with the Ocarina deduplication and compression built-in,” said Bob Fine, director of product marketing at Dell Compellent.
While the aforementioned companies developed or acquired data deduplication technology, Permabit Technology Corp. has developed Albireo, a dedupe software library it intends to license to storage vendors, enabling them to add deduplication to storage systems with a shorter time to market and without the risk inherent in developing it themselves. “With Xiotech, BlueArc and LSI, we have three announced customers, and we expect first product shipments with Permabit deduplication later in 2011,” said Tom Cook, Permabit’s CEO.
This was first published in June 2011