By now it’s clear that all major storage vendors will support flash in their systems. But the debate rages over whether flash should be used as cache or as persistent storage.
Earlier this month NetApp revealed plans to support solid-state Flash drives as cache and persistent storage in its FAS systems beginning next year. The cache model will come first.
“We believe the optimal use case initially lies in cache,” says Patrick Rogers, NetApp VP of solutions marketing. NetApp has developed wear-leveling algorithms that will be incorporated into WAFL, its Write Anywhere File Layout file system. WAFL’s awareness of access frequency and other block characteristics will allow it to use both DRAM and flash, with flash serving as the “victim cache,” a landing spot for blocks displaced from the primary DRAM cache.
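To make the “victim cache” idea concrete, here is a minimal sketch of a two-tier cache in which blocks evicted from a small primary tier land in a larger secondary tier. It illustrates the general technique only and is not NetApp’s WAFL implementation; the class name `TwoTierCache`, the LRU eviction policy, and the `read_from_disk` callback are all assumptions made for the example.

```python
from collections import OrderedDict


class TwoTierCache:
    """LRU primary cache (standing in for DRAM) with a victim cache (standing in for flash).

    Hypothetical illustration: names and the LRU policy are assumptions.
    """

    def __init__(self, dram_blocks, flash_blocks):
        self.dram = OrderedDict()    # block id -> data, least recently used first
        self.flash = OrderedDict()   # holds only blocks displaced from self.dram
        self.dram_blocks = dram_blocks
        self.flash_blocks = flash_blocks

    def _insert_dram(self, block_id, data):
        self.dram[block_id] = data
        self.dram.move_to_end(block_id)
        if len(self.dram) > self.dram_blocks:
            # The block displaced from DRAM lands in the victim cache.
            victim_id, victim_data = self.dram.popitem(last=False)
            self.flash[victim_id] = victim_data
            self.flash.move_to_end(victim_id)
            if len(self.flash) > self.flash_blocks:
                # The oldest victim leaves the cache hierarchy entirely.
                self.flash.popitem(last=False)

    def read(self, block_id, read_from_disk):
        """Return (data, tier), where tier is 'dram', 'flash', or 'disk'."""
        if block_id in self.dram:
            self.dram.move_to_end(block_id)
            return self.dram[block_id], "dram"
        if block_id in self.flash:
            # Victim-cache hit: promote the block back into DRAM.
            data = self.flash.pop(block_id)
            self._insert_dram(block_id, data)
            return data, "flash"
        data = read_from_disk(block_id)
        self._insert_dram(block_id, data)
        return data, "disk"
```

Note that in this sketch flash is written only when DRAM evicts a block, which is why the wear and write-penalty questions raised about flash as cache matter: every eviction costs a flash write whether or not the block is ever read again.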
Why not just use DRAM? “If you have a very large amount of data, and you can’t accommodate it entirely in [DRAM] cache, flash offers much higher capacities,” Rogers says.
EMC’s Burke has raised three questions about flash as cache:
- What read hit ratios and repetitive reads of a block are required to overcome the NAND write penalty?
- How will accelerated cell wear-out be avoided for NAND-based caches?
- What would be required to use NAND flash as a write cache – do you have to implement some form of external data integrity verification and a means to recover from a damaged block (e.g., mirroring writes to separate NAND devices, etc.)?
I asked Burke to answer his own questions when it came to Flash as persistent storage, which is EMC’s preference so far. He answered me in an email:
- Overcoming the Write penalty – not an issue, because storage arrays generally always buffer writes, notify the host that the I/O is completed and then destage the writes to the flash drives asynchronously. Plus, unlike a cache, the data doesn’t have to be read off of disk first – all I/O’s can basically be a single direct I/O to flash: read what you need, write what’s changed. As such, reads aren’t deferred by writes – they can be asynchronously scheduled by the array based on demand and response time.
- Accelerated Wear-out – not an issue, for as I noted, the write speed is limited by the interface or the device itself, and the drives are internally optimized with enough spare capacity to ensure a predictable lifespan given the known maximum write rate. Also, as a storage device, every write into flash is required/necessary, whereas with [cache], there likely will be many writes that are never leveraged as a cache hit – cache will always be written to more than physical storage (almost by definition).
- Data Integrity – again, not an issue, at least not with the enterprise drives we are using. This is one of the key areas that EMC and STEC collaborated on, for example, to ensure that there is end-to-end data integrity verification. Many flash drives don’t have this level of protection yet, and it is not inherent to the flash technology itself. So anyone implementing flash-as-cache has to add this integrity detection and recovery or run the risk of undetected data corruption.
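The kind of integrity detection and recovery Burke describes for a flash write cache (verify each block, and recover from a mirrored copy if one is damaged) can be sketched as follows. This is a hedged illustration, not the mechanism EMC and STEC built: the `MirroredStore` class, the use of CRC32 as the checksum, and the two in-memory dictionaries standing in for separate NAND devices are all assumptions.

```python
import zlib


class MirroredStore:
    """Keeps each block on two stand-in devices, each copy tagged with a CRC32.

    Hypothetical illustration: real arrays use stronger, end-to-end schemes.
    """

    def __init__(self):
        # Two dicts standing in for two separate NAND devices.
        self.devices = [{}, {}]

    def write(self, block_id, data: bytes):
        record = (zlib.crc32(data), data)
        for device in self.devices:
            device[block_id] = record

    def read(self, block_id) -> bytes:
        for index, device in enumerate(self.devices):
            checksum, data = device[block_id]
            if zlib.crc32(data) == checksum:
                # Every copy tried before this one failed verification,
                # so overwrite those copies with the good one.
                for damaged in self.devices[:index]:
                    damaged[block_id] = (checksum, data)
                return data
        raise IOError("block %r: all copies failed verification" % (block_id,))
```

The cost is visible even in a toy like this: every cached write goes to flash twice, which feeds back into the wear-out question.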
I also asked NetApp for a response. So far there has been no formal response to Burke’s specific questions, but some NetApp blog posts address the plans for Flash deployments, and one of them links to a white paper with more specifics.
For the first question, according to the white paper, “Like SSDs, read caching offers the most benefit for applications with a lot of small, random read I/Os. Once a cache is populated, it can substantially decrease the average response time for read operations and reduce the total number of HDDs needed to meet a given I/O requirement.”
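The white paper’s claim can be put into rough numbers with a simple weighted-average model. The sketch below is an assumption-laden back-of-the-envelope calculation, not anything taken from NetApp: the function names are made up, it ignores the cost of populating the cache, and any latency or IOPS figures fed into it are hypothetical.

```python
import math


def average_read_latency_ms(hit_ratio, cache_latency_ms, hdd_latency_ms):
    """Weighted average of cache hits and HDD misses (ignores cache fill cost)."""
    return hit_ratio * cache_latency_ms + (1.0 - hit_ratio) * hdd_latency_ms


def hdds_needed(required_read_iops, hit_ratio, iops_per_hdd):
    """Drives needed to serve only the reads that miss the cache."""
    miss_iops = required_read_iops * (1.0 - hit_ratio)
    return math.ceil(miss_iops / iops_per_hdd)
```

With made-up inputs of 10,000 random read IOPS and 200 IOPS per drive, `hdds_needed` returns 50 drives with no cache and 13 with a 75% hit ratio. What the model leaves out is exactly what Burke’s first question asks about: how many hits a block must earn to repay the write that put it in flash.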
Not as specific an answer as you could hope for, but it’s a start. NetApp also appears to have an offering in place for customers to use to determine which specific applications in their environment might benefit from Flash as cache, called Predictive Cache Statistics (PCS).
As to the second question, according to the white paper, “NetApp has pioneered caching architectures to accelerate NFS storage performance with its FlexCache software and storage acceleration appliances. FlexCache eliminates storage bottlenecks without requiring additional administrative overhead for data placement.”
Another precinct was also heard from in the vendor blogosphere on these topics, in a comment on Chuck Hollis’s blog late last week. With regard to the write penalty, Fusion-io CTO David Flynn argued that the bandwidth problem can be compensated for with parallelism, i.e., by using an array of NAND chips in a Flash device. On latency, he wrote:
“Latency, on the other hand, cannot be ‘fixed’ by parallelism. However, in a caching scheme, the latency differential between two tiers is compensated for by choice of the correct access size. While DRAM is accessed in cache lines (32 bytes if I remember correctly), something that runs at 100 times higher latency would need to be accessed in chunks 100 times larger (say around 4KB).
“Curiously enough, the demand page loading virtual memory systems that were designed into OS’s decades ago does indeed use 4KB pages. That’s because it was designed in a day when memory and disk were only about a factor of 100 off in access latency – right where NAND and DRAM are today.”
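Flynn’s rule of thumb is easy to check. The short sketch below scales an access size by a latency ratio and rounds up to the next power of two; the function name and the power-of-two rounding are assumptions added for illustration, while the 32-byte line and the factor of 100 are Flynn’s own approximate figures.

```python
def scaled_access_size(base_bytes, latency_ratio):
    """Scale an access size by a latency ratio, then round up to a power of two.

    Returns (raw_bytes, rounded_bytes).
    """
    raw = base_bytes * latency_ratio
    size = 1
    while size < raw:
        size *= 2
    return raw, size
```

Calling `scaled_access_size(32, 100)` gives 3,200 bytes raw, which rounds up to 4,096 bytes, the 4KB page size Flynn mentions.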
This is an extension of the debate that has been going on all year about the proper place for solid-state media. Server vendors such as Hewlett-Packard argue that Flash used as persistent storage behind a controller puts the bottleneck of the network between the application and the speedy drives, defeating the purpose of adding them to boost performance. And round and round we go. At this rate, the discussion could last longer than the disk vs. tape argument.