Hybrid storage arrays balance price and performance between all-flash and all-hard-disk storage. They provide the speed and low latency of flash without the cost of a fully loaded solid-state system.
Although hybrid flash arrays are more complex to manage than all-flash arrays or all-HDD systems, the flexibility and lower cost make them worth considering. Hybrid flash array prices range from tenths of a cent per gigabyte to several dollars per gigabyte, and performance levels range from tens of megabytes per second to hundreds of gigabytes per second.
While the basic components of hybrid storage are flash- and disk-based storage, there are several tiers of flash storage; several more of disk-based storage; plus tape, cloud and offline tiers of storage from which to choose. That could result in up to 10 tiers of pricing and performance.
Most enterprises have many different storage workloads with varying performance requirements. Some workloads, such as real-time and online transaction processing with constant high loads, require constant high performance. Other workloads, such as virtual desktop infrastructure (VDI) and server virtualization, require bursts of high performance, and workloads such as user home directories require relatively low performance.
While it is tempting to divide storage arrays into flash and hybrid, there are several subdivisions within those categories. For instance, there are at least three kinds of flash, in order of descending performance: memory bus, nonvolatile memory express (NVMe) bus, and SATA and SAS bus, and some vendors add a DRAM-based storage component on top of those.
There are three speeds of hard disks typically used by storage vendors, measured in RPM at 15,000, 10,000 and 7,200, as well as both SAS and SATA versions, and versions with integrated flash or large amounts of cache memory that improve performance over base models.
Most hybrid storage systems use two tiers of storage or more to deliver the optimum performance for different workloads. Even all-disk systems with no flash typically have relatively large amounts of RAM built into the storage controllers to accelerate requests, and many all-disk systems now integrate flash, as well. Even some low-cost NAS systems include the option to add flash drives in addition to the spinning disks that come standard.
IOPS, scalability, resiliency and power consumption can all be affected by the tier and setup of a given partition. For instance, a volume or partition can be configured to replicate locally, at a backup site or in the cloud.
While SSDs have obvious performance advantages, including better throughput and IOPS, they offer some less obvious benefits, as well. Not only do they provide lower power consumption at idle, but they have the ability to go from idle to active in microseconds, while spun-down hard drives may take seconds to go from spin-down mode to active.
Depending on the manufacturer, storage management software may be called storage virtualization, auto-tiering or automatic data migration. Any of these will enable you to create virtual volumes or partitions and vary the amount and type of flash, HDD and other tiers to give different types of applications the performance they need. You can give real-time apps that need high performance and low latency an always-flash, or even an always-NVMe, partition, and give an archive application an all-HDD partition. Or you can let the storage management software handle data placement, which enables you to give each type of application the optimum performance, whether the workload is real-time analytics, online transaction processing, VDI or server virtualization.
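To make the idea of per-application partitions concrete, here is a minimal sketch in Python. It is not any vendor's API: the tier labels, the VolumePolicy class and the make_policy function are all hypothetical names chosen for illustration, and real products expose these settings through their own management tools.

```python
from dataclasses import dataclass

# Hypothetical tier labels, fastest first; real products use their own names.
TIERS = ["nvme", "sas_ssd", "hdd_10k", "hdd_7200"]


@dataclass
class VolumePolicy:
    name: str
    allowed_tiers: list  # tiers this volume's data may occupy
    auto_tier: bool      # True lets the software move data between allowed tiers


def make_policy(name, workload):
    """Return an illustrative placement policy for a workload type."""
    if workload == "realtime":
        # Always-NVMe partition for latency-sensitive applications.
        return VolumePolicy(name, ["nvme"], auto_tier=False)
    if workload == "archive":
        # All-HDD partition for rarely read data.
        return VolumePolicy(name, ["hdd_7200"], auto_tier=False)
    # Everything else (VDI, server virtualization and so on) is left to
    # the storage management software to place across all tiers.
    return VolumePolicy(name, list(TIERS), auto_tier=True)


if __name__ == "__main__":
    for vol, kind in [("oltp-db", "realtime"), ("vdi-pool", "vdi"), ("archive01", "archive")]:
        print(make_policy(vol, kind))
```

The point of the sketch is the split between pinned volumes, which name a single tier, and managed volumes, which name every tier and leave placement to the software.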
Some storage arrays can connect to tape, cloud storage gateways or offline storage, such as removable media, including RDX or optical drives, to further increase the range of performance and cost to suit various applications. For instance, archiving applications can take advantage of cold storage in some clouds or use gateways to reach object-based storage in multiple data centers.
Hybrid storage arrays offer performance at lower costs
Tiering or automatic data migration will generally keep the most active data in the fastest available tiers, although data sets can be manually designated to run on specific storage. You might want to do that, for instance, if you want to always keep database indices in an all-flash partition. By moving the most-used data to the fastest storage, any data in use will get the best performance available.
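The placement logic described above can be sketched in a few lines of Python. This is a simplified illustration, assuming access counts per block are already known and that flash capacity is measured in block-sized slots; the place_blocks function and the block names are hypothetical, and production tiering engines use far more elaborate heuristics.

```python
def place_blocks(access_counts, flash_slots, pinned=()):
    """Split block IDs into (flash, hdd) sets.

    access_counts: dict mapping block ID -> recent access count
    flash_slots:   number of blocks the flash tier can hold
    pinned:        block IDs manually designated to stay on flash
    """
    pinned = set(pinned)
    if len(pinned) > flash_slots:
        raise ValueError("pinned data exceeds flash capacity")

    # Rank the unpinned blocks hottest first; ties break on block ID
    # so the result is deterministic.
    candidates = [b for b in access_counts if b not in pinned]
    candidates.sort(key=lambda b: (-access_counts[b], b))

    free_slots = flash_slots - len(pinned)
    flash = pinned | set(candidates[:free_slots])
    hdd = set(access_counts) - flash
    return flash, hdd


if __name__ == "__main__":
    counts = {"db-index": 2, "orders": 90, "invoices": 50, "logs": 5, "old-scans": 1}
    flash, hdd = place_blocks(counts, flash_slots=2, pinned=["db-index"])
    print("flash:", sorted(flash))
    print("hdd:  ", sorted(hdd))
```

In the usage example, the database index stays on flash even though it is rarely read, because it was pinned; the one remaining flash slot goes to the most active block.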
Using extensive research into the hybrid flash storage market, TechTarget editors focused on market-leading vendors and other well-established enterprise vendors. Our research included data from TechTarget surveys, as well as reports from other respected research firms, including Gartner.
Because most data follows the 80/20 rule, keeping the 20% of the data that is generally in use on a flash tier will effectively provide great performance for all the data. The 80% that is declining in use stays on slower storage and will be automatically moved to a faster tier if it is accessed.
The actual percentages will vary, but keeping the 20% most active data in flash requires only 20% or less of the total capacity to be flash. Because flash is fast enough to support compression and deduplication without performance degradation, you can gain a 2.5x to 6x reduction in the amount of flash needed to hold 20% of the total data.
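The arithmetic is easy to check. The short Python sketch below uses a hypothetical flash_needed_tb helper and an assumed 100 TB data set; the 20% hot fraction and the 2.5:1 and 6:1 reduction ratios are the figures cited above, not measurements.

```python
def flash_needed_tb(total_tb, hot_fraction=0.20, reduction_ratio=1.0):
    """Physical flash (in TB) needed to hold the hot share of a data set.

    total_tb:        total logical data, in terabytes
    hot_fraction:    share of the data that is actively used (0 to 1)
    reduction_ratio: combined compression/deduplication ratio (1.0 = none)
    """
    if not 0 <= hot_fraction <= 1:
        raise ValueError("hot_fraction must be between 0 and 1")
    if reduction_ratio < 1:
        raise ValueError("reduction_ratio must be at least 1")
    return total_tb * hot_fraction / reduction_ratio


if __name__ == "__main__":
    total = 100  # assumed data set size in TB
    for ratio in (1.0, 2.5, 6.0):
        print(f"{ratio}:1 reduction -> {flash_needed_tb(total, 0.20, ratio):.1f} TB of flash")
```

For the assumed 100 TB data set, the hot 20% needs 20 TB of flash with no data reduction, about 8 TB at 2.5:1 and roughly 3.3 TB at 6:1.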
Where does cloud storage fit in with hybrid storage arrays?
Many storage vendors are adding cloud gateways to their storage systems. These gateways enable the storage administrator to create rules-based data migration from internal storage in the data center to lower-cost, hard-disk-based storage, and eventually to even lower-cost cloud storage. The cloud can be used for cold storage or backups, both of which use inexpensive media.
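A rules-based migration policy of this kind might look like the following Python sketch. The idle-time thresholds, tier names and the target_tier function are assumptions made for illustration; actual gateways define rules in their own policy language and may key on attributes other than last access time.

```python
from datetime import datetime, timedelta

# Hypothetical rules: (minimum idle time, destination tier), longest idle first.
RULES = [
    (timedelta(days=365), "cloud_cold"),
    (timedelta(days=30), "hdd"),
]


def target_tier(last_access, now, rules=RULES, default="flash"):
    """Return the tier where data should live, based on how long it has been idle."""
    idle = now - last_access
    for min_idle, tier in rules:
        if idle >= min_idle:
            return tier
    # Recently used data stays on the fastest tier.
    return default


if __name__ == "__main__":
    now = datetime(2024, 1, 1)
    for days in (2, 90, 800):
        print(days, "days idle ->", target_tier(now - timedelta(days=days), now))
```

Ordering the rules from longest idle time to shortest means the first match is always the coldest tier the data qualifies for.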
How hybrid storage arrays are sold
The typical model is a single appliance, which can be a small box with a controller and up to 24 drives, or a separate controller box with two or more controllers, plus multiple shelves filled with drives. These systems can be expanded with additional controller-plus-drive appliances, or in the more modular systems with additional controllers, shelves of drives or both. Enterprise-grade systems might have as many as 16 pairs of controllers, each with multiple shelves of 24 or more drives attached.
But it is possible to save substantial amounts of upfront hardware costs by using a software-only product, such as DataCore's SANsymphony, with existing storage attached to servers, in appliances or in various SAN storage systems. System setup and maintenance require considerably more expertise than a single-vendor SAN system, but if there are large amounts of pre-existing storage, it can amount to considerable cost savings. Some systems allow the user to buy a system for a relatively low price, with only a small number of disks activated, and to add more capacity as needed.
The major market share holders -- including Dell EMC, Hewlett Packard Enterprise, IBM, Hitachi and NetApp -- have broad and deep lines of products that can provide anywhere from a few terabytes of storage, suitable for a remote or branch office, to a huge, data-center-grade system that can handle tens of thousands of users and millions of transactions per second. Smaller vendors, such as Infinidat, Tintri and Tegile, which was recently acquired by Western Digital, have fewer models and a narrower focus on the types of customers they attract.