Using hybrid storage

There's more than one way to build a hybrid storage array, writes Jon Toigo.

You've probably heard the term hybrid storage bandied about. It's used just about any time different storage media -- tape, disk, DRAM, flash memory -- are cobbled together. For example, the latest Linear Tape File System-enabled tape storage systems typically include a server front end or "head" that features a few trays of hard disk drives. These drives serve as a pre-fetch file cache, so the start of a requested file can stream to the user from the disk cache while the full file is located on tape and positioned in front of the tape drive's read-write head.

That's one example of hybrid storage. Another is the use of memory, whether DRAM or flash solid-state drives (SSDs), in connection with hard disk drives (HDDs), as a mechanism for increasing throughput between application software and storage. Prominent network-attached storage (NAS) product vendors have done similar things, with application writes initially made to non-volatile memory cache until the file data can be positioned on HDDs in conformance with the NAS vendor's file layout and RAID scheme. This is called spoofing, and it's typically done to optimize the performance of NAS storage.

Some storage virtualization products use a similar spoofing technique, giving I/O an apparent speed boost by caching all writes until they can be committed to physical storage. This architecture is also called hybrid storage.
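To make the write-spoofing idea concrete, here is a minimal Python sketch of a write-back cache. It is an illustration only, not any vendor's implementation: the class name, the dict standing in for the disk tier, and the capacity-based flush trigger are all assumptions made for the example.

```python
class WriteBackCache:
    """Toy write-back cache: acknowledge writes from fast memory, destage later."""

    def __init__(self, backing_store, capacity):
        self.backing_store = backing_store  # dict standing in for the HDD tier (assumption)
        self.capacity = capacity            # max dirty blocks held before a forced flush
        self.dirty = {}                     # block id -> data not yet written to disk

    def write(self, block_id, data):
        # The application gets its acknowledgement as soon as the cache holds the data.
        self.dirty[block_id] = data
        if len(self.dirty) > self.capacity:
            self.flush()
        return "ack"

    def read(self, block_id):
        # The newest copy of a block may still be sitting in the cache.
        if block_id in self.dirty:
            return self.dirty[block_id]
        return self.backing_store.get(block_id)

    def flush(self):
        # Destage in block order, a stand-in for the vendor's file layout and RAID scheme.
        for block_id in sorted(self.dirty):
            self.backing_store[block_id] = self.dirty[block_id]
        self.dirty.clear()
```

The application sees the latency of the cache rather than the disk, which is why the boost is only apparent: the data still has to reach the HDDs eventually, and a real product needs non-volatile memory so that acknowledged writes survive a power loss.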

Reduce disk storage energy demands with flash hybrids

The term hybrid storage is also used to distinguish rigs that combine flash SSDs and HDDs (flash hybrids or hybrid drives) from "pure" or all-flash SSD arrays. Flash hybrids typically use their flash SSD components to complement their disk array components via intelligent read caching. If data written to disk receives multiple concurrent accesses, intelligent software in the hybrid array copies the data onto an SSD and re-points access requests to the SSD. This allows requests to be served at the higher IOPS available from the SSD component. When the accesses fall off over time, new requests are re-pointed to the location of the data on the hard disk components of the array, and the copied file is erased from the SSD cache.
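The copy-and-re-point cycle described above can be sketched in a few lines of Python. Treat this as a rough model under stated assumptions rather than how any shipping array works: the promotion threshold, the sliding time window, and every name in the code are hypothetical, and real products track "hotness" with far more sophistication.

```python
from collections import deque


class HotDataCache:
    """Toy read cache: copy frequently read blocks to SSD, drop them when they cool."""

    def __init__(self, hdd, promote_after=3, window=10.0):
        self.hdd = hdd                      # dict standing in for the disk tier (assumption)
        self.ssd = {}                       # dict standing in for the flash tier (assumption)
        self.promote_after = promote_after  # recent reads needed before promotion (assumption)
        self.window = window                # seconds of history that count as "recent"
        self.hits = {}                      # block id -> deque of recent read timestamps

    def _recent_hits(self, block_id, now):
        # Discard timestamps that have aged out of the sliding window.
        times = self.hits.setdefault(block_id, deque())
        while times and now - times[0] > self.window:
            times.popleft()
        return times

    def read(self, block_id, now):
        times = self._recent_hits(block_id, now)
        times.append(now)
        if block_id in self.ssd:
            return self.ssd[block_id], "ssd"   # request re-pointed to flash
        data = self.hdd[block_id]
        if len(times) >= self.promote_after:
            self.ssd[block_id] = data          # copy only; the HDD keeps the original
        return data, "hdd"

    def evict_cold(self, now):
        # Erase SSD copies whose accesses have fallen off; reads go back to disk.
        for block_id in list(self.ssd):
            if not self._recent_hits(block_id, now):
                del self.ssd[block_id]
```

Because the SSD only ever holds a copy, evicting a cold block is just a delete; nothing has to be written back to disk. That is the practical difference between this read cache and the write spoofing discussed earlier.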

This scenario holds great promise for reducing the energy demands of disk storage by minimizing the number of hard disks that must be deployed to achieve high IOPS rates. For example, one vendor has striped together 1,900 disk drives and currently offers a speed of 450,000 IOPS from its flagship storage array. Throughput, according to the vendor, can be increased even more by short-stroking the disks -- restricting the area of the platter to which data can be written, which in turn restricts the traverse of the read-write head in each hard disk. The result is quicker head seeks and faster performance.

But the price of all-disk performance storage is a huge energy bill. Each drive, whether short-stroked or not, requires about 7 watts of electrical power; 1,900 drives therefore draw roughly 13.3 kilowatts for the drives alone.
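The arithmetic behind that power draw is easy to check. The short Python sketch below uses only the figures quoted above (1,900 drives, about 7 watts each, 450,000 IOPS); the assumption that every drive runs at a constant draw around the clock is mine, and the result ignores controllers, cooling and power-supply losses.

```python
DRIVES = 1_900            # drive count quoted for the vendor's array
WATTS_PER_DRIVE = 7       # approximate per-drive draw quoted above
ARRAY_IOPS = 450_000      # IOPS quoted for the array
HOURS_PER_YEAR = 24 * 365 # assumption: drives spin continuously all year

total_watts = DRIVES * WATTS_PER_DRIVE
annual_kwh = total_watts * HOURS_PER_YEAR / 1000
iops_per_drive = ARRAY_IOPS / DRIVES

print(f"Drive power draw: {total_watts / 1000:.1f} kW")
print(f"Annual drive energy: {annual_kwh:,.0f} kWh")
print(f"IOPS per drive: {iops_per_drive:.0f}")
```

Multiply the annual kilowatt-hours by your own utility rate to get a cost figure. The per-drive IOPS number also shows why so many spindles are needed: each disk contributes only a couple of hundred IOPS to the total.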

The appeal of hybrid storage -- described above as a combination of flash-SSD read caching, HDDs and smart "hot data" copy-and-caching technology -- is that you can obtain the same IOPS as arrays with thousands of drives from a kit using only a fraction of the number of HDDs. The price per gigabyte is usually a tad steeper for the hybrid storage array than for an all-HDD array, but energy cost savings in the first year can make the difference irrelevant.

Before using this kind of hybrid storage, you need an understanding of which applications would benefit from higher IOPS. Transactional systems are clear candidates, though many firms are deploying faster hybrid storage behind virtualized servers in a desperate effort to increase throughput. The problem with using hybrid storage to speed up virtual server workloads is that, in a virtualized server setting, it is typically the hypervisor itself that creates the performance issues. Technically speaking, hypervisor log jams can't be addressed in a meaningful way by faster storage. Even so, it may be useful to trial a hybrid storage vendor's kit to determine whether it makes a noticeable difference in workload processing.

Some flash product vendors provide software that complements operating system tools, such as Microsoft xperf or Linux blktrace, and that can be used to see which application workloads are I/O bound and therefore amenable to acceleration techniques implemented on storage components. Before jumping into hybrid storage, it's a good idea to become familiar with xperf and blktrace, as well as tools like LSI Nytro Predictor and other vendor utilities, which can give you data on whether the investment will be worth your time and money.
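As a cruder first look on Linux, before reaching for blktrace, you can estimate how busy each block device is from two snapshots of /proc/diskstats. The Python sketch below is not blktrace or xperf and does not reproduce what those tools report; it is a swapped-in, device-level check similar in spirit to the utilization figure iostat shows. The function names are hypothetical, and the code assumes the standard diskstats layout in which the 13th column is the milliseconds the device spent doing I/O.

```python
def io_busy_ms(diskstats_text):
    """Map device name -> cumulative milliseconds spent doing I/O."""
    busy = {}
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or unexpectedly short lines
        # fields[2] is the device name; fields[12] is time spent doing I/O (ms)
        busy[fields[2]] = int(fields[12])
    return busy


def utilization(before, after, interval_ms):
    """Percent of the interval each device was busy, capped at 100."""
    return {
        dev: min(100.0, 100.0 * (after[dev] - before[dev]) / interval_ms)
        for dev in after
        if dev in before
    }
```

To use it, read /proc/diskstats, wait a known interval, read it again, and pass both parsed snapshots to utilization along with the interval in milliseconds. A device that sits near 100 percent while the application waits is a hint that the workload is I/O bound, but it does not tell you which application is responsible; that is where blktrace, xperf or the vendor utilities come in.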
