
Overcoming the unpredictable nature of the hybrid flash array

Eric Slack takes a closer look at what happens when the flash in a hybrid flash array 'misses.'

Hybrid flash systems are a popular choice for providing high-performance storage at costs closer to those of traditional all-disk arrays. Basically, these are storage systems (block or file-based) that contain both hard disk drives and NAND flash with a caching or tiering software layer that's charged with keeping the appropriate data on the flash tier.

But the consequences of a so-called flash miss, where the required data is not in the cache or on the flash tier when requested, can actually be worse than the consequences of that data being on a traditional disk array with no flash. The reason is that applications using that hybrid flash array are designed to expect flash performance. The overall performance that users experience depends on the high-speed interaction between the application software and the CPU, a process that requires consistent storage latency.

It's important to note that while a flash cache and a flash tier use different mechanisms for getting the right data onto flash before the application requests it, the impact of a flash miss is essentially the same. A miss suddenly replaces flash latencies with disk drive latencies that are an order of magnitude greater, making storage performance unpredictable. This inconsistency can have a more significant impact on overall application performance than slower storage alone would explain.
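A simple weighted-average model shows why even a small miss rate matters. The sketch below uses illustrative latency figures (assumptions, not vendor numbers) to compute the average I/O latency at several flash hit rates:

```python
# Sketch: how a modest miss rate erodes hybrid-array latency.
# Latency figures are illustrative assumptions, not vendor numbers.

FLASH_LATENCY_US = 100      # ~0.1 ms, a typical NAND flash read
DISK_LATENCY_US = 5_000     # ~5 ms, a typical HDD seek + rotation

def effective_latency_us(hit_rate: float) -> float:
    """Average I/O latency for a given flash hit rate (0.0-1.0)."""
    return hit_rate * FLASH_LATENCY_US + (1 - hit_rate) * DISK_LATENCY_US

for hit_rate in (1.0, 0.99, 0.95, 0.90):
    print(f"{hit_rate:.0%} hits -> {effective_latency_us(hit_rate):,.0f} us average")
```

Even a 95% hit rate leaves average latency several times higher than pure flash, and the worst-case I/Os, the misses themselves, are fifty times slower, which is what applications tuned for flash actually notice.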

This disparity between hybrid flash arrays and all-hard-disk-drive (HDD) arrays is compounded by the fact that most hybrid storage systems use a smaller number of higher-capacity disk drives, while traditional, high-performance disk arrays are usually designed with greater spindle counts and faster drives. So, when that flash miss does occur, the application is subjected to the greater latency of the hybrid system's disk tier.

Why do misses happen?

The simple answer is that the flash area isn't large enough to hold all the data that users or applications are requesting. This is an economics issue; more flash adds to the cost of the array, making the hybrid that much more expensive than a disk array. But adding flash also narrows the price gap between the hybrid flash array and all-flash systems, which, of course, have no concerns about a flash miss.

If we dig a little deeper, we find that some caching or tiering systems are less efficient than others, meaning they're essentially wasting flash capacity. Others are simply less effective at deciding whether certain data objects are "flash worthy." Or the data being stored may be too unpredictable, too random or too sequential, for the software to accurately assess its flash worthiness.

But there are things hybrid array manufacturers can do to minimize the chances of a flash miss. Obviously, these are design aspects that should be considered when evaluating a hybrid flash array.

Make the flash area bigger

Manufacturers can make the flash area larger, or better yet, support the expansion of flash capacity. This allows users to tailor their flash-disk drive mix to best meet their workloads. This also allows users to mix the type of flash deployed, for example, using a smaller amount of high-speed, high-endurance SLC flash for the initial creation of data and a less-expensive, larger MLC flash area.

Make the flash area look bigger

Another method is to increase the effective capacity of the flash area through such data reduction techniques as deduplication, compression and thin provisioning. These technologies have been used for years in disk storage, but they can be even more effective in flash, thanks to flash's much greater performance. Some all-flash arrays rely on data reduction to make their economics more appealing, so it's certainly appropriate for hybrid arrays as well. However, it's important that the storage system have enough CPU power to run these processes and still deliver the storage performance users bought it for in the first place.
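The arithmetic behind "effective capacity" is straightforward: raw flash capacity multiplied by the data reduction ratios. The reduction ratios below are illustrative assumptions; real ratios depend heavily on the workload:

```python
# Sketch: effective flash capacity after data reduction.
# The 2:1 dedupe and 1.5:1 compression ratios are illustrative
# assumptions; real-world ratios vary widely by workload.

def effective_capacity_gb(raw_gb: float,
                          dedupe_ratio: float,
                          compression_ratio: float) -> float:
    """Logical capacity presented after deduplication and compression."""
    return raw_gb * dedupe_ratio * compression_ratio

# A 1 TB flash tier with 2:1 dedupe and 1.5:1 compression
print(effective_capacity_gb(1000, 2.0, 1.5))  # 3000.0 -> behaves like a 3 TB tier
```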

Make the flash area smarter

Caching and tiering leverage some of the most sophisticated software algorithms being developed today, and those algorithms differ significantly from one hybrid flash array to the next. How the caching or tiering software works needs to be understood, especially with the specific data types most likely to be stored, because the effectiveness of this software can be data-dependent.

Some arrays have application-specific processes or software modules that know which data objects are the most important in a particular application, such as indexes or log files in a database. By giving this data priority, the array can ensure it's in flash when needed. As a last resort, these systems should allow users to "pin" certain data sets to flash to ensure consistent performance for mission-critical applications.
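Pinning can be modeled as an eviction policy that skips protected entries. The sketch below is a minimal, hypothetical illustration using a least-recently-used (LRU) cache; real arrays expose pinning through their own management interfaces, not an API like this:

```python
from collections import OrderedDict

# Sketch: an LRU flash cache that lets users "pin" critical data sets.
# Hypothetical interface for illustration only.

class PinnableLRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.cache = OrderedDict()   # key -> data, least recently used first
        self.pinned = set()          # keys that are never evicted

    def pin(self, key):
        self.pinned.add(key)

    def put(self, key, value):
        self.cache[key] = value
        self.cache.move_to_end(key)
        # Evict the least recently used *unpinned* entry when over capacity.
        while len(self.cache) > self.capacity:
            for victim in self.cache:
                if victim not in self.pinned:
                    del self.cache[victim]
                    break
            else:
                break  # everything is pinned; nothing can be evicted

    def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)  # a hit refreshes recency
            return self.cache[key]
        return None  # a miss: the caller falls back to the disk tier

cache = PinnableLRUCache(capacity=2)
cache.pin("db-index")
cache.put("db-index", "...")
cache.put("log", "...")
cache.put("scratch", "...")       # evicts "log", not the pinned index
print("db-index" in cache.cache)  # True
```

The key design point: pinning trades cache flexibility for predictability, which is exactly the property mission-critical applications need.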

Hybrid array controllers are pulling tough duty, because they're constantly evaluating and moving data into and out of the flash area, in addition to providing overall storage I/O performance for the system. For this reason, the way an array handles the entire process of data analysis and data movement should be as efficient as possible. Optimizing the controller code for flash and implementing special metadata-handling routines both improve this efficiency.

Another way to improve efficiency is to make the data objects smaller, so less unneeded data is handled by the caching or tiering process. For example, some caching software can cache individual VMDK files rather than entire volumes, and still others can cache certain data within the VMDK itself.
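The capacity savings from finer granularity are easy to quantify. In this sketch the sizes are illustrative assumptions, but they show how much flash is wasted when a large object is cached whole to reach a small hot subset:

```python
# Sketch: finer caching granularity wastes less flash.
# Sizes are illustrative assumptions, not measurements.

VMDK_SIZE_GB = 100
HOT_DATA_GB = 8   # the portion of the VMDK actually being accessed

# Caching the whole virtual disk pulls cold data into flash along
# with the hot data; caching only the hot blocks inside the VMDK
# leaves that flash free for other workloads.
whole_vmdk_cached = VMDK_SIZE_GB
sub_vmdk_cached = HOT_DATA_GB
print(f"flash wasted at whole-VMDK granularity: "
      f"{whole_vmdk_cached - sub_vmdk_cached} GB")
```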

Make flash the priority

Some tiering systems write new data to the disk area first, then promote it to flash once its access frequency reaches a certain threshold. This can mean that important data recently written to the system is not on flash for an extended period. If new data instead were first written to flash, then demoted to disk, this "flash warm-up" period could be eliminated, decreasing the chances of a flash miss.
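The write-to-flash-first approach can be sketched as a toy tiering model: new writes always land on flash, and when flash fills, the coldest object is demoted to disk. This is an illustrative model with hypothetical names, not any vendor's tiering engine:

```python
# Sketch: new writes land on flash first; cold data is demoted to disk.
# Illustrative model only; real tiering engines are far more elaborate.

flash_tier = {}   # object_id -> access count
disk_tier = {}
FLASH_SLOTS = 3   # tiny flash tier for demonstration

def write(object_id):
    """New data goes straight to flash, so there is no 'warm-up' period."""
    flash_tier[object_id] = 0
    if len(flash_tier) > FLASH_SLOTS:
        # Demote the coldest object (fewest accesses) to the disk tier.
        coldest = min(flash_tier, key=flash_tier.get)
        disk_tier[coldest] = flash_tier.pop(coldest)

def read(object_id):
    if object_id in flash_tier:
        flash_tier[object_id] += 1
        return "flash hit"
    return "disk read"  # a fuller model might promote the object here

write("a"); write("b"); read("a"); read("a")
write("c"); write("d")              # flash full: demotes the coldest ("b")
print(read("a"), read("b"))
```

Note the contrast with promote-on-threshold tiering: here a freshly written object is servable from flash on its very first read, rather than only after it has "earned" promotion.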

Hybrid flash arrays all use some form of caching or tiering software to help ensure that the data applications require is in flash when it's needed, thereby maximizing the performance impact of their flash capacities. But when that required data isn't in flash, the resulting flash miss can have a significant impact on the applications that were expecting faster storage performance. To address this, hybrid storage arrays have several ways to improve their flash hit rates.

Making the flash capacity larger can help, but so can making its effective capacity larger by incorporating such data reduction technologies as deduplication and compression into the storage system. Similar benefits can result from making the system more efficient by leveraging an awareness of common applications and platforms or using more specific caching or tiering routines. Finally, designing the system so that new data writes go first to the flash area instead of to disk can ensure important data objects reach flash as quickly as possible.



Join the conversation

1 comment


Eric has a great point and is hitting on the evolution of new software capabilities that are required to manage flash effectively. As with all technologies, the first generation is always about speeds and feeds. Think back to x86 servers. Then, as the technology matures and delivers even better speeds and feeds, it typically outpaces the customer need. In the x86 example, VMware came in and capitalized on the fact that no one application could utilize all the x86 horsepower. The same thing is happening with flash: we've moved beyond speeds and feeds, and the next iteration of flash innovation will be about prioritizing exactly what data belongs there and what doesn't.