In spite of what you may have heard, flash-based storage -- whether you're talking about memory chip-laden PCI Express cards aimed at servers, 2.5-inch disk form-factor solid-state
For flash storage to be applied successfully to an existing storage infrastructure, it must be used smartly. For example, flash can provide great caching services -- reading, writing or both -- and increase I/O per second if used correctly to accelerate accesses to "hot" data (data with multiple concurrent read accesses immediately after writing). However, flash can't fix poorly designed applications, improperly deployed databases or server hypervisors that slow down the I/O of guest machines by funneling their data through some sort of poorly implemented storage controller emulation.
What flash-based storage vendors have sought to do, with some success, is to work around the inherent limitations of their technology, including cell wear and non-linear performance. Cell wear refers to the tendency of flash memory cell locations to burn out after being written to approximately 250,000 times (in the case of multi-level cell chips). When a cell goes, a group of cells goes with it.
This problem is being addressed -- not so much fixed as circumvented -- by delivering products with much more storage capacity than is advertised in the product literature and on the packaging, a practice known as over-provisioning. The extra capacity, combined with some interesting algorithms, commonly called wear leveling, that spread writes evenly across cells, permits fresh cell groups to be substituted for those that have been rewritten the maximum number of times.
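To make the idea concrete, here is a minimal sketch of spare capacity plus wear leveling. It is a toy model, not any vendor's firmware: the class name, the block counts and the `max_erases` limit are all hypothetical, and real controllers work at a much finer level of detail.

```python
class WearLeveledDevice:
    """Toy model: logical blocks are mapped onto a larger pool of physical blocks."""

    def __init__(self, advertised_blocks, spare_blocks, max_erases):
        total = advertised_blocks + spare_blocks   # spare capacity is hidden from the user
        self.max_erases = max_erases               # hypothetical endurance limit per block
        self.erase_counts = [0] * total            # wear recorded per physical block
        self.mapping = {}                          # logical block -> physical block
        self.free = set(range(total))              # physical blocks ready to be written

    def write(self, logical_block):
        # The block holding the old data must be erased before it can be reused.
        old = self.mapping.pop(logical_block, None)
        if old is not None:
            self.erase_counts[old] += 1
            if self.erase_counts[old] < self.max_erases:
                self.free.add(old)                 # still usable, return it to the pool
            # otherwise the block is retired and never used again
        if not self.free:
            raise RuntimeError("no usable physical blocks left")
        # Wear leveling: always pick the least-worn free block.
        target = min(self.free, key=lambda b: (self.erase_counts[b], b))
        self.free.remove(target)
        self.mapping[logical_block] = target
        return target


def writes_until_failure(device, logical_block=0):
    """Rewrite one logical block until the device runs out of usable blocks."""
    count = 0
    try:
        while True:
            device.write(logical_block)
            count += 1
    except RuntimeError:
        return count
```

With these toy numbers, a device with one block and no spares survives 3 writes to the same logical block, while adding three spare blocks stretches that to 12. The cells wear out just as fast; there are simply more of them to share the load.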
Flash memory is also known for non-linear performance. The first time data is written to cell groups, it flies. No technology, with the possible exception of tape, can write data as quickly as fresh flash memory. That said, writing to the storage a second time is problematic for flash. What was a one-step process (write to cells) now requires several steps that include erasing previously written cells and then writing new data to them. The expected outcome is a 50% reduction in write speeds after the first write. Again, many vendors have brought some sleight-of-hand algorithms to bear on this problem: every write is directed to fresh cells drawn from the pool of unadvertised extra memory, while older data is erased from previously written cells in a background process.
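The arithmetic behind that slowdown, and behind the background-erase workaround, can be sketched with a small cost model. The assumptions are loud ones: an erase is taken to cost the same as a write (which is what yields the 50% figure), and the function name and parameters are hypothetical.

```python
def average_write_cost(n_writes, spare_blocks, bg_erases_per_write,
                       program_cost=1.0, erase_cost=1.0):
    """Average foreground cost per rewrite, in arbitrary time units.

    Assumes every write replaces existing data, so each write either lands
    on a pre-erased block or must erase a block in the foreground first.
    """
    clean = spare_blocks   # pre-erased blocks ready for immediate writes
    dirty = 0              # blocks holding stale data, waiting to be erased
    total = 0.0
    for _ in range(n_writes):
        if clean > 0:
            clean -= 1
            total += program_cost                 # one-step write to a fresh block
            dirty += 1                            # the old copy is now stale
        else:
            total += erase_cost + program_cost    # erase in place, then write
        # Background process reclaims some stale blocks between writes.
        reclaimed = min(dirty, bg_erases_per_write)
        dirty -= reclaimed
        clean += reclaimed
    return total / n_writes
```

In this model, `average_write_cost(100, 0, 0)` is 2.0 units per write (half the speed of a fresh write), while a single spare block with one background erase per write, `average_write_cost(100, 1, 1)`, holds the cost at 1.0. Spare capacity without background erasing only delays the slowdown until the pool is used up.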
The above isn't intended to cast aspersions on flash storage, only to approach it with clear eyes. In and of themselves, the methods used to rectify the shortcomings of flash-based storage are no more egregious than the use of spoofing (caching ahead of disk) to give the appearance of a faster disk storage array. If it works reliably, it is what it is.
Workarounds, virtualization and hot data
It's important to note that all the workarounds in the world for the inherent limitations of flash-based storage are meaningless if the technology is aimed at the wrong goal. Typically, memory-based storage is sought to speed up application performance -- a noble goal, provided the technology is applied to the right applications.
Poor application performance may have little to do with storage performance or I/O throughput. Looking at CPU utilization and queue lengths on hard disk drive storage can tell you a lot. If processors are waiting for data and I/O instructions are stacking up and waiting their turn to be executed at the disk, these may be tell-tale signs that some flash caching might help alleviate the logjam. Microsoft Windows and Linux provide native facilities, such as xperf and blktrace respectively, that can help you determine whether an application might benefit from flash caching or acceleration.
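As a rough illustration of that reasoning, the check below flags a workload as a caching candidate only when both symptoms appear together. It is a sketch, not a diagnostic tool: the function name and both thresholds are hypothetical, and the samples are assumed to have been collected already with whatever monitoring facility you use.

```python
from statistics import mean


def flash_cache_candidate(samples, iowait_threshold=20.0, queue_threshold=2.0):
    """Return True if the workload looks storage-bound.

    samples: iterable of (cpu_iowait_percent, disk_queue_length) pairs.
    Both thresholds are illustrative assumptions, not recommended values.
    """
    samples = list(samples)
    if not samples:
        raise ValueError("need at least one sample")
    avg_iowait = mean(s[0] for s in samples)   # processors waiting for data
    avg_queue = mean(s[1] for s in samples)    # I/O requests stacked up at the disk
    return avg_iowait >= iowait_threshold and avg_queue >= queue_threshold
```

A server averaging high I/O wait with a deep disk queue would return True here. One with busy processors but an empty disk queue would return False, which suggests the bottleneck lies somewhere flash won't reach.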
Leading vendors of flash-based storage gear usually provide utility software of their own that examines the application server and storage environments to spot opportunities for improvement. Some, like LSI's Nytro Predictor, will even provide advice on how much memory is needed to do the job.
Even with these tools, there is a trial-and-error aspect to applying flash storage to resolve application performance issues, and it's made worse by the introduction of server hypervisors and other intermediary software that may obfuscate the profiling of application I/O patterns. For years, vendors have sought to profile application workloads so that services, including acceleration, could be applied in an elegant manner. To date, there's no standardized way to profile application I/O -- at least, none with sufficient reliability to enable "atomic units" of storage to be applied when needed or when the need is forecast.
Perhaps the easiest way to bring the general benefits of I/O caching to applications is to virtualize your storage. Using DRAM in the storage virtualization server and technology such as DataCore Software's adaptive caching, you can effectively deliver a three-to-four-times acceleration to all I/O transactions.
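A back-of-the-envelope way to see where a multiple like that can come from is the standard effective-latency formula for a cache. The sketch below is generic arithmetic, not a description of any product's caching algorithm, and the latency figures passed in are hypothetical.

```python
def effective_latency(hit_ratio, cache_latency, backend_latency):
    """Average latency when a fraction hit_ratio of I/Os is served from cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_latency + (1.0 - hit_ratio) * backend_latency


def speedup(hit_ratio, cache_latency, backend_latency):
    """How many times faster the average I/O is compared with no cache."""
    return backend_latency / effective_latency(hit_ratio, cache_latency, backend_latency)
```

If the cache latency is treated as negligible next to disk latency, the speedup reduces to 1 / (1 - hit ratio). A three-to-four-times acceleration would then correspond to roughly 67% to 75% of I/Os being served from cache.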
By combining such an architecture with underlying arrays that swap hot data from disk to flash solid-state drives while it's in high demand and then re-point requests back to the disk when the number of accesses drops off (known as "hot sheets" in X-IO ISE arrays), you can reduce the amount of disk you need to field in your infrastructure and, by extension, the load on your utility power. Facility power efficiency may become a very important metric for storage tuning over the next few years, even more so than application I/O throughput.
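The promote-and-demote behavior described here can be sketched as a simple threshold policy. This is an assumption-laden illustration and not the algorithm any array vendor actually uses: the class name, the per-interval access counts and the two thresholds are all hypothetical.

```python
class TieringPolicy:
    """Toy hot-data policy: promote busy extents to flash, demote them when they cool."""

    def __init__(self, promote_at, demote_below):
        self.promote_at = promote_at       # accesses per interval that count as "hot"
        self.demote_below = demote_below   # accesses per interval that count as "cold"
        self.on_flash = set()              # extents currently served from flash

    def rebalance(self, access_counts):
        """access_counts maps extent id -> accesses seen in the last interval."""
        promoted = sorted(e for e, n in access_counts.items()
                          if n >= self.promote_at and e not in self.on_flash)
        demoted = sorted(e for e in self.on_flash
                         if access_counts.get(e, 0) < self.demote_below)
        self.on_flash.update(promoted)            # copy hot extents to flash
        self.on_flash.difference_update(demoted)  # re-point cold extents back to disk
        return promoted, demoted
```

Keeping the demotion threshold well below the promotion threshold stops data that hovers near the boundary from bouncing between tiers on every interval.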
This was first published in May 2013