When flash-based storage first became a viable, high-performance option for the enterprise, it was sold as a static tier, and workloads had to be moved into it manually. This was an inefficient use of a high-cost storage resource, since the workload being moved typically didn't require flash performance all the time. In addition, many data centers found it difficult to identify which workloads, or parts of workloads, should be moved to flash.
As a result of these challenges, flash was purchased only for extreme performance problems. To broaden the use of flash, vendors had to provide some level of automation that moved the most active data in and out of the flash tier so the resource was used more efficiently.
Vendors took two approaches to solving this problem -- tiering and caching. While the terms are often used interchangeably, they actually mean two entirely different things. Both models use an algorithm to determine which data should be on which type of storage. In some cases, this algorithm simply organizes data on a first-in, first-out basis, but many vendors have enhanced their algorithms to account for specific access patterns.
What is flash tiering?
In a tiering model, data is moved to the flash tier exclusively, meaning it never exists on both flash and hard disk drives. This creates a greater need for high availability because any failure of this tier can cause data loss. Considering that the data being moved is the most active data, critical information could be lost.
When tiering strategies first came out, all new or modified data (writes) was sent to the hard-disk tier first, then qualified for a move into flash based on its read activity. Most algorithms required multiple accesses over a short period of time before data was promoted. This approach kept new data on a known technology (hard disks), but it also meant that write I/O could never take advantage of flash performance.
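The "multiple accesses over a short period" rule can be sketched as a sliding-window read counter. This is only an illustration under assumed values: the class name `PromotionQualifier`, the threshold of three reads and the 60-second window are hypothetical and not taken from any vendor's algorithm.

```python
from collections import defaultdict, deque


class PromotionQualifier:
    """Hypothetical sketch: flag a block for promotion to flash once it has
    been read `min_reads` times within `window_seconds`."""

    def __init__(self, min_reads=3, window_seconds=60.0):
        self.min_reads = min_reads              # assumed threshold
        self.window_seconds = window_seconds    # assumed window length
        self._reads = defaultdict(deque)        # block id -> recent read times

    def record_read(self, block_id, now):
        """Record a read at time `now` (seconds); return True if the block
        now qualifies for promotion."""
        history = self._reads[block_id]
        history.append(now)
        # Forget reads that have aged out of the sliding window.
        while history and now - history[0] > self.window_seconds:
            history.popleft()
        return len(history) >= self.min_reads


# Three reads within 20 seconds: the third read qualifies the block.
hot = PromotionQualifier()
hot_results = [hot.record_read("block-7", t) for t in (0.0, 10.0, 20.0)]

# Three reads spread over 200 seconds: the block never qualifies.
cold = PromotionQualifier()
cold_results = [cold.record_read("block-9", t) for t in (0.0, 100.0, 200.0)]
```

Note that only reads are counted here, which mirrors the limitation described above: a block that is written often but read rarely would stay on hard disk.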
As vendors and users became more comfortable with flash technology, they got used to the idea of sending all write I/O to the flash tier first and then demoting as needed to the hard-disk tier. This process allowed writes to experience the full performance of flash and is now considered "safe" given all the protection that goes into the flash tier.
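A flash-first tier with demotion might look like the following sketch. It assumes a least-recently-used demotion policy and a tiny capacity purely for illustration; `FlashFirstTier` and its methods are hypothetical names, and promotion of disk-resident data back to flash is left out to keep the example short.

```python
from collections import OrderedDict


class FlashFirstTier:
    """Hypothetical sketch: writes land on flash, and the coldest blocks are
    moved (not copied) down to hard disk when flash fills up."""

    def __init__(self, flash_capacity):
        self.flash_capacity = flash_capacity
        self.flash = OrderedDict()   # ordered from coldest to hottest
        self.disk = {}

    def write(self, key, value):
        # Drop any older copy on disk so the block lives in exactly one tier.
        self.disk.pop(key, None)
        self.flash[key] = value
        self.flash.move_to_end(key)  # mark as most recently used
        self._demote_if_full()

    def read(self, key):
        if key in self.flash:
            self.flash.move_to_end(key)
            return self.flash[key]
        return self.disk[key]

    def _demote_if_full(self):
        while len(self.flash) > self.flash_capacity:
            cold_key, cold_value = self.flash.popitem(last=False)
            self.disk[cold_key] = cold_value   # moved, not copied


tier = FlashFirstTier(flash_capacity=2)
for name, payload in (("a", 1), ("b", 2), ("c", 3)):
    tier.write(name, payload)
# "a" was the coldest block, so it now lives only on disk.
```

Because each block exists in only one place, losing the flash dictionary in this sketch would lose "b" and "c" outright, which is why the flash tier needs its own protection.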
What is caching?
Generally speaking, caching takes a more temporal approach than tiering. Where data could sit on flash for months in a tiered system, a cache will typically only hold data for as long as it's needed. There are three basic types of caching: write-around, write-through, and write-back.
Write-around caching means all data is written to the hard-disk area first and is copied, not moved, to the flash area as it is qualified, based on read activity. This means data always resides on the hard-disk tier, which is typically protected by RAID or mirroring. As a result, the cache area does not need the same level of reliability as described in the tiering model above. Also, because all data is written to the hard-disk area first, flash life expectancy should increase. Only data that is truly flash-worthy is written to flash.
Note, though, that while a failure in the flash area will not mean data loss, it will mean a performance loss, since all subsequent I/O must come from the hard-disk area. Users who have become accustomed to flash performance may consider the performance loss to be almost as bad as downtime.
A write-around technique also means all data has to be qualified prior to being promoted to the flash area. So, writes will be slower in this model, and flash performance will not be seen until enough I/O accesses occur on the data set to justify the copy to flash. Finally, the copy process requires its own share of storage controller processing and I/O bandwidth, which could create some system unpredictability as data is copied up to the flash tier.
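The write-around behavior described above can be sketched in a few lines. The `WriteAroundCache` class, the two-read promotion threshold and the choice to invalidate a cached copy on overwrite are all assumptions made for illustration, not a description of any particular product.

```python
class WriteAroundCache:
    """Hypothetical sketch: writes go to hard disk only; data is copied to
    flash after it has been read `promote_after` times."""

    def __init__(self, promote_after=2):
        self.promote_after = promote_after   # assumed qualification threshold
        self.disk = {}
        self.flash = {}
        self.read_counts = {}

    def write(self, key, value):
        self.disk[key] = value
        # Assumption: drop any cached copy so reads never return stale data.
        self.flash.pop(key, None)
        self.read_counts[key] = 0

    def read(self, key):
        """Return (value, tier that served the read)."""
        if key in self.flash:
            return self.flash[key], "flash"
        value = self.disk[key]
        self.read_counts[key] = self.read_counts.get(key, 0) + 1
        if self.read_counts[key] >= self.promote_after:
            self.flash[key] = value          # copied; disk keeps its copy
        return value, "disk"


wa_cache = WriteAroundCache(promote_after=2)
wa_cache.write("a", 1)
served_from = [wa_cache.read("a")[1] for _ in range(3)]

# Simulate a flash failure: no data is lost, but reads fall back to disk.
wa_cache.flash.clear()
after_failure = wa_cache.read("a")
```

The first two reads are served from disk while the data qualifies, and only the third benefits from flash, which illustrates the delay before flash performance is seen.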
Write-through caching attempts to solve some of the challenges with write-around caching. In this model, all writes are sent to both the flash and hard-disk tiers. As a result, data is still redundantly available on both tiers. This technique pre-positions the most active data on flash, so there is no need to prequalify data or copy it. Write performance is still dependent on the hard-disk area since write acknowledgment has to come from that tier. Also, flash endurance is reduced since all data now goes to the flash area, not just the data qualified to be there.
Write-back caching addresses the write performance issue by acknowledging the write as soon as it is written to flash. This data is later written to the hard-disk tier when storage I/O is less busy. The risk associated with write-back caching is a failure occurring in the flash area before this data has been written to hard disk. Also, in a clustered environment like VMware, when a virtual machine is migrated, extra caution is needed to make sure the flash area is flushed prior to migration.
Like tiering, write-back caching also requires that greater high-availability standards be applied to the flash area, since there are times when it could hold the only copy of data.
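As a minimal sketch of the write-through idea, the hypothetical `WriteThroughCache` below stores every write on both tiers and returns only after the hard-disk copy is in place. Real systems issue these writes through a storage controller; plain dictionaries stand in for the two tiers here.

```python
class WriteThroughCache:
    """Hypothetical sketch: every write is stored on both flash and hard disk,
    and the write is acknowledged only after the disk copy exists."""

    def __init__(self):
        self.flash = {}
        self.disk = {}

    def write(self, key, value):
        self.flash[key] = value   # pre-positions the data on flash
        self.disk[key] = value    # the slower tier gates the acknowledgment
        return True               # acknowledged once both copies exist

    def read(self, key):
        if key in self.flash:
            return self.flash[key], "flash"
        return self.disk[key], "disk"


wt_cache = WriteThroughCache()
wt_cache.write("a", 1)
first_read = wt_cache.read("a")     # served from flash with no qualification

# Simulate a flash failure: the disk copy still serves the data.
wt_cache.flash.clear()
fallback_read = wt_cache.read("a")
```

Unlike a write-around design, the very first read is a flash hit, at the cost of writing every block to flash.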
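The exposure window in write-back caching can be illustrated by tracking which blocks are "dirty," that is, present on flash but not yet on hard disk. The `WriteBackCache` class and its `flush` and `at_risk` methods are hypothetical names used only for this sketch.

```python
class WriteBackCache:
    """Hypothetical sketch: writes are acknowledged from flash and copied to
    hard disk later; dirty blocks exist only on flash until flushed."""

    def __init__(self):
        self.flash = {}
        self.disk = {}
        self.dirty = set()   # keys not yet written to hard disk

    def write(self, key, value):
        self.flash[key] = value
        self.dirty.add(key)
        return True          # acknowledged at flash speed

    def flush(self):
        """Copy all dirty blocks to hard disk, for example during a quiet
        period or before a virtual machine migration. Returns the count."""
        flushed = len(self.dirty)
        for key in self.dirty:
            self.disk[key] = self.flash[key]
        self.dirty.clear()
        return flushed

    def at_risk(self):
        """Blocks that would be lost if flash failed right now."""
        return sorted(self.dirty)


wb_cache = WriteBackCache()
wb_cache.write("a", 1)
wb_cache.write("b", 2)
exposed_before = wb_cache.at_risk()   # both blocks exist only on flash
flushed_count = wb_cache.flush()
exposed_after = wb_cache.at_risk()    # nothing is exposed after the flush
```

Anything returned by `at_risk` in this sketch is data that depends entirely on the flash area, which is the reason redundancy in that area matters.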
When to use which?
Most of the initial concern about flash-based storage being durable enough for the enterprise has long passed. And thanks to declining flash pricing, the cost of deploying redundant flash via RAID or mirroring is also less of a concern.
If you are considering a flash-enabled system, you can feel confident deploying either tiering or write-back caching, assuming you double-check that proper redundancy techniques are being implemented in the storage system. This will provide the best performance, which is why you invest in flash-based storage in the first place.
From a user perspective, neither caching nor tiering has a significant advantage over the other if the proper redundancy is in place, so a flash-based storage system should not be chosen on these features alone. IT planners may instead want to investigate other factors, such as the ability to pin certain data sets to the flash tier or cache area, and integration with the environment or application.
About the author:
George Crump is president of Storage Switzerland, an IT analyst firm focused on storage and virtualization.