What you will learn in this tip: Automated storage tiering (AST) is an integrated feature in many vendors’ appliances, but each vendor’s AST option works differently. The majority of vendors don’t make automated tiering an optional feature in their appliances, but some are more customizable and controllable than others. Some IT professionals prefer a “set-and-forget” tiering model, while others want more control over the data movement in their storage environments. Find out about the different automated storage tiering products; decide which one is right for your data storage environment; and see how solid-state storage complements the tiering process.
Neither storage tiering nor AST is a new technology. In fact, Hewlett-Packard (HP) Co. claims to have implemented automated storage tiering in 1996. Nevertheless, the adoption of AST has been relatively slow. That’s because the earliest implementations required a significant effort to classify data and develop the policies that governed data movement between tiers. Most often, data was moved based on age, which is rarely the best arbiter of value.
Current auto-tiering implementations use sophisticated algorithms that calculate the usage of data chunks ranging in size from 4 KB up to 1 GB, depending on vendor and settings. Usage is calculated from access demand relative to other chunks, because there’s no absolute definition of “high demand.” Data can be elevated to a higher tier during high-demand periods and demoted when demand lessens. The quality of the algorithm determines the value of the product, and the size of the block determines workload suitability. Smaller block sizes are generally better for random I/O, while larger sizes are better for sequential I/O.
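To make the idea of relative demand concrete, here is a minimal Python sketch. It is not any vendor's algorithm; the function name, the chunk IDs, the two-tier layout and the per-window access counts are all assumptions made for illustration. It ranks chunks against one another and places only the busiest ones in the fast tier, so no fixed "high demand" threshold is needed.

```python
def plan_placement(access_counts, tier0_slots):
    """Assign each chunk to tier 0 (fast) or tier 1 (slow) by relative demand.

    access_counts: dict mapping a chunk ID to its access count in the last
    sampling window. tier0_slots: how many chunks the fast tier can hold.
    Both are hypothetical inputs for this sketch.
    """
    # Rank chunks from busiest to quietest; ties break on chunk ID so the
    # result is deterministic.
    ranked = sorted(access_counts, key=lambda chunk: (-access_counts[chunk], chunk))
    hot = set(ranked[:tier0_slots])
    return {chunk: 0 if chunk in hot else 1 for chunk in access_counts}


# Example sampling window (made-up numbers).
window = {"chunk-a": 5, "chunk-b": 50, "chunk-c": 1, "chunk-d": 20}
placement = plan_placement(window, tier0_slots=2)
print(placement)
```

With two fast-tier slots, chunk-b and chunk-d are placed in tier 0. Multiplying every count by 10 leaves the placement unchanged, because only the ranking matters. Rerunning the function on each new sampling window promotes and demotes chunks as demand shifts.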
Both established vendors and emerging vendors offer AST capabilities. Some of the newer vendors, such as Dell Compellent, have made automated storage tiering a cornerstone of their product architecture. With the company’s Storage Center product line and its Fluid Data Architecture, there’s only one array architecture and AST is an integrated part of it. Fluid Data Architecture data movement block size is a relatively granular 2 MB.
Similarly, for Avere Systems Inc., AST isn’t an optional feature in its FXT appliances. However, it adds the ability to use any network-attached storage (NAS) or JBOD array as tier 3 storage. Thus, Avere offers both inter- and intra-array tiering. In addition, Avere uses its own file system, which gives it an additional measure of control over data movement in its algorithm. FXT is a “set-and-forget” model that doesn’t allow user modification of movement policies, although tiers can be scaled separately to match workload changes.
For Greg Folsom, CIO at Arnold Worldwide, simplicity is the key issue. According to Folsom, Dell Compellent systems are “drop-dead easy” to install and manage. Arnold Worldwide, a Boston-based ad agency, uses a three-tier strategy with two different storage policies. “These things are so easy that even I can be talked through managing them when our storage manager is away from the office,” he joked.
Chris Elam, Arnold Worldwide’s senior systems engineer, began using Dell Compellent’s default automated tiered storage policies but tweaked them over time. Dell Compellent’s Enterprise Manager utility helped Elam identify usage patterns. “Enterprise Manager helped us to see exactly how data is accessed in the system. With this information, we created a tier 1-2 policy for some apps and a tier 2-3 policy for other applications. We’ve been using the system for more than four years and we haven’t had to change the policies in a long time,” Elam said. New volumes are simply assigned to one of the policies at creation time.
Solid-state storage complements tiering
Xiotech Corp. offers another example of a “set-and-forget” AST implementation. Xiotech’s Hybrid ISE product combines solid-state drives (SSDs) and hard disk drives in a sealed 14.4 TB 3U container. Of the 14.4 TB, 1 TB is SSD and the rest comprises 900 GB 10K rpm SAS drives (tier 2). Controller-level software, called Continuous Adaptive Data Placement, automatically manages data placement from the moment of deployment. Although the company provides a graphical ISE Analyzer utility to highlight I/O activity, in practice a user can’t adjust any of the parameters or configuration. The company says it designed Hybrid ISE to never need tuning.
Among the vendors offering more configurable architectures, NetApp Inc. stresses the ability to scale performance and capacity separately. The firm’s Flash Cache (PAM II) product is analogous to tier 0 SSD in other product lines. Though it can support multiple tiers, NetApp said in many cases the tiers can be simplified to two: Flash Cache and either tier 2 or 3. That’s because they’ve found data tends to be either “hot” or “cold” and rarely in between. Buffer cache is used to buffer write activity to avoid performance degradation.
NetApp’s data movement block size is the most granular of the products covered here, at just 4 KB. Although this architecture may require more flash than other systems (10% to 20% of total capacity), eliminating relatively expensive tier 1 hard disks and spreading cold data across more SATA drives can deliver the same performance at a lower total cost. Moreover, NetApp combines AST with deduplication and compression on the spinning disk for even greater space efficiency. Because data is managed through the WAFL file system and Data ONTAP, it doesn’t need to be “rehydrated” when being elevated from a lower tier to tier 0 as the data becomes hot. The same automated storage tiering capabilities apply across all NetApp product lines.
CERN, the European Organization for Nuclear Research in Geneva, uses NetApp’s Flash Cache on Oracle RAC databases. “Prior to using Flash Cache, we had to size everything based on IOPS regardless of storage utilization,” said Eric Grancher of the CERN IT department. “Now, we can optimize both IOPS and capacity. We have moved from expensive Fibre Channel drives to less-expensive SATA drives. This has resulted in a substantial savings for the organization.” Grancher has found the NetApp system to be very adaptive to workloads, which simplifies management. In his experience, overall performance is better when the flash memory is in the storage rather than in the servers. “It makes more sense to have the stable NetApp systems cache the data rather than the database servers, which are restarted more frequently for patching or updates. A data cache on the storage server is already ‘warmed up’ and so eliminates the inevitable periods of poor performance we would suffer with cold server-based caches after each restart,” he said.
EMC Fully Automated Storage Tiering (FAST) is another example of a more configurable system. FAST has an install wizard that implements default configurations for simple deployment; EMC says the majority of users find these defaults sufficient for “set and forget” operation. Other users tap into FAST Tier Advisor, a utility that collects usage statistics over time. Those statistics can be used to apply optimized policies for specific applications. Users can also set the size of the data movement block from 768 KB to 1 GB, depending on whether the reads tend to be random or sequential.
EMC recommends that users start with approximately 3% of capacity in tier 0, 20% in tier 1 and 77% in tier 3. Tier Advisor will track usage and, over time, tier 1 should be minimized as little more than a buffer between the higher and lower tiers. In any event, Tier Advisor lets users optimize any of the tiers based on actual usage patterns.
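As a quick worked example of that starting split, the Python sketch below converts the recommended percentages into capacities for a given array size. The function name and the 200 TB total are hypothetical, chosen only for illustration; the tier labels and percentages are taken as stated above.

```python
# Starting split from the EMC recommendation quoted above, in whole percent.
# Tier labels are kept exactly as the text gives them.
STARTING_SPLIT = {"tier 0": 3, "tier 1": 20, "tier 3": 77}


def starting_capacities(total_tb, split=STARTING_SPLIT):
    """Return the capacity in TB to provision per tier for a given total."""
    if sum(split.values()) != 100:
        raise ValueError("split percentages must add up to 100")
    return {tier: total_tb * pct / 100 for tier, pct in split.items()}


plan = starting_capacities(200)  # 200 TB is an arbitrary example size
for tier, tb in plan.items():
    print(f"{tier}: {tb:.1f} TB")
```

For a 200 TB array this works out to 6 TB, 40 TB and 154 TB. These are only initial figures, since the split is expected to be revised once usage statistics have been collected.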
Which automated storage tiering option should you choose?
Overall, vendors offer an array of automated storage tiering options and capabilities with their products. Not sure which option to choose? Identify the specific needs of your environment and match them to the product that best meets them. Knowing your performance needs will help you get the most out of your automated storage tiering product.
For more information on this subject, please read part two of this tip, "Automated storage tiering strategy: Inter-array tiering."
BIO: Phil Goodwin is a storage consultant and freelance writer.
This article was previously published in Storage magazine.