NetApp Flash Pool and Flash Cache: A Deep Dive
NetApp has introduced several important innovations since first incorporating flash technology in 2009. On the software side, we have focused on the use of intelligent caching to maximize storage infrastructure efficiency. In hardware, our innovations have focused on high performance and reliability. This article provides a closer look at the hardware elements that enable our hybrid flash arrays: NetApp® Flash Cache™ and NetApp Flash Pool™.
NetApp Flash Cache
Flash Cache modules are three-quarter-length PCIe cards that fit into NetApp FAS and V-Series storage controllers. NetApp Flash Cache modules currently range up to 1TB in capacity, enabling up to 8TB of flash capacity per controller.
Flash Cache modules contain SLC flash, managed by an on-board FPGA. This FPGA controls all communication between system main memory and flash memory. The FPGA was designed for speed, providing critical functionality in several areas.
- The Flash Cache FPGA interleaves writes throughout multiple write queues, resulting in balanced flash erase, write, and read cycles. This allows many flash modules to operate in parallel and addresses the slow erase cycles inherent in flash devices.
- The FPGA supports multiple memory interfaces, each going several banks deep. When one flash bank on an interface is busy, the FPGA can issue a command to another bank on the same interface. This prevents stalls when requests bunch up.
- Within Flash Cache, the FPGA does not read from flash cells individually, but in groups that are striped across multiple flash banks, which substantially reduces read latency.
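The bank selection and striping described in the list above can be illustrated with a small software model. This is only a sketch under stated assumptions: the real logic runs in the FPGA, and the names `FlashBank`, `pick_idle_bank`, `stripe_write`, and `stripe_read`, along with the tick-based notion of "busy", are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FlashBank:
    """Hypothetical model of one flash bank on a memory interface."""
    busy_until: int = 0                        # tick at which the bank is idle again
    pages: dict = field(default_factory=dict)  # (block_id, chunk_index) -> bytes


def pick_idle_bank(banks, now):
    """Return the index of the first bank that is idle at tick `now`, else None."""
    for i, bank in enumerate(banks):
        if bank.busy_until <= now:
            return i
    return None


def stripe_write(banks, block_id, data, chunk_size):
    """Split `data` into chunks and place chunk k on bank k % len(banks)."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    for k, chunk in enumerate(chunks):
        banks[k % len(banks)].pages[(block_id, k)] = chunk
    return len(chunks)


def stripe_read(banks, block_id, n_chunks):
    """Reassemble a block from its striped chunks.

    In hardware the per-bank reads could overlap; this model reads them in order.
    """
    return b"".join(
        banks[k % len(banks)].pages[(block_id, k)] for k in range(n_chunks)
    )
```

Because consecutive chunks land on different banks, a single block read touches several banks at once rather than queuing behind one, which is the intuition behind the latency benefit of striping.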
NetApp systems employing Flash Cache have enabled throughput in excess of 250,000 IOPS at low latency, and they have proven that high-performance FC/SAS HDDs can often be replaced with fewer high-capacity SATA HDDs without sacrificing performance.
NetApp Flash Pool
Flash Pool combines HDDs and SSDs within a single storage system and provides a choice of using SLC or eMLC SSD modules. Using SSD and HDD within a single aggregate allows hot data to be tiered automatically between HDD and SSD. As with Flash Cache, hot data is not “moved” to SSD; rather, a copy of the data is created in SSD. Copying data is faster than moving it, and, once hot data is ejected from the SSD, there is no need to write it back to HDD.
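The copy-rather-than-move behavior can be sketched as a small read cache. This is an illustrative model, not the Data ONTAP implementation: the `CopyCache` class is hypothetical, a plain dictionary stands in for HDD, and least-recently-used eviction is an assumption, since the eviction policy is not described here.

```python
from collections import OrderedDict


class CopyCache:
    """Hypothetical read cache that holds copies; the backing store stays authoritative."""

    def __init__(self, backing, capacity):
        self.backing = backing      # stands in for HDD: block -> data
        self.capacity = capacity    # maximum number of cached blocks
        self.cache = OrderedDict()  # stands in for SSD, kept in LRU order (assumed policy)

    def read(self, block):
        if block in self.cache:
            self.cache.move_to_end(block)   # mark as most recently used
            return self.cache[block]
        data = self.backing[block]          # read from the backing store
        self.cache[block] = data            # insert a copy; the original stays put
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # eviction just drops the copy
        return data
```

Eviction here is a single deletion with no write-back step, because the backing store never gave up its copy of the data.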
Unlike Flash Cache, however, Flash Pool can cache both reads and writes in its SSDs. When a write operation occurs, logic determines whether it would be faster to write to both SSD and HDD together or just to HDD. Similarly, when a read operation occurs, logic determines whether data should be cached to SSD or read directly from HDD without caching. In all cases, Flash Pool algorithms are optimized for speed.
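The shape of these two decisions can be sketched as a pair of policy functions. The criteria below are assumptions made purely for illustration: the size threshold `SMALL_IO_BYTES`, the sequential-versus-random test, and the `min_hits` parameter are hypothetical and do not describe the actual Flash Pool algorithms.

```python
from enum import Enum


class Target(Enum):
    HDD_ONLY = "hdd"
    SSD_AND_HDD = "ssd+hdd"


SMALL_IO_BYTES = 16 * 1024  # hypothetical cutoff for a "small" write


def place_write(size_bytes, is_sequential):
    """Choose where a write goes (assumed heuristic, not the real policy)."""
    # Assumption: large or sequential writes are already efficient on HDD.
    if is_sequential or size_bytes > SMALL_IO_BYTES:
        return Target.HDD_ONLY
    return Target.SSD_AND_HDD


def should_cache_read(is_sequential, recent_hits, min_hits=2):
    """Decide whether a block that was just read should be copied to SSD."""
    # Assumption: only randomly accessed blocks that are read repeatedly are cached.
    return (not is_sequential) and recent_hits >= min_hits
```

For example, under these assumed rules a 4KB random write would go to both SSD and HDD, while a 1MB sequential write would go to HDD only.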
Making Flash SSDs Reliable
When using flash-based SSDs, reliability is a major consideration. When NetApp decided to offer flash SSD solutions, we knew our customers would require enterprise-class reliability. We also knew that, as with our HDDs, we had to back our SSDs with a five-year warranty.
We approached flash SSD quality from several fronts. First, we worked closely with SSD vendors to make sure their architectures had adequate safeguards. Then we did our own extensive testing to validate SSD designs. Next, we changed the way our systems handle SSDs (relative to HDDs) for two internal Data ONTAP® features: Disk Sanitize and Maintenance Center. Although Sanitize is great for scrubbing data from HDDs, it is slow when applied to SSDs and requires too many erasure cycles for a flash device to endure. So for SSDs we came up with a new way to scrub data that preserves those valuable erasure cycles.
Maintenance Center provides proactive disk health diagnostics and monitoring. If an HDD begins behaving badly, it is removed from service without disruption to applications and logically assigned to Maintenance Center for evaluation. The Maintenance Center software determines whether disk errors are transient and recoverable, or if they are an early indication of a failure condition that requires physical replacement of the HDD.
When an SSD acts up, however, it is immediately flagged for replacement and sent directly to our service engineers for extensive failure analysis. We want to learn more about how and why SSDs fail so we can identify and correct trouble early. Although the SSD failure rate has been extremely low, we know that early-failure analysis is important when using new technology.
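The difference in handling between HDDs and SSDs can be summarized as a small decision sketch. The enum and function names are hypothetical, and returning a drive to service after transient errors is an assumption inferred from the description of Maintenance Center above.

```python
from enum import Enum


class DriveType(Enum):
    HDD = "hdd"
    SSD = "ssd"


class Action(Enum):
    MAINTENANCE_CENTER = "evaluate in Maintenance Center"
    REPLACE = "flag for replacement"
    RETURN_TO_SERVICE = "return to service"


def triage(drive_type):
    """First response when a drive starts misbehaving."""
    if drive_type is DriveType.SSD:
        return Action.REPLACE             # SSDs skip evaluation and go to failure analysis
    return Action.MAINTENANCE_CENTER      # HDDs are evaluated first


def maintenance_verdict(errors_transient):
    """Outcome of an HDD evaluation (assumed two-way result)."""
    return Action.RETURN_TO_SERVICE if errors_transient else Action.REPLACE
```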
Finally, we developed advanced SSD reporting, available through our AutoSupport™ capability. Our SSDs have new parameters that enable customers to estimate the projected lifetime remaining for each SSD, enabling proactive replacement of any SSD approaching the end of its useful life.
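One simple way to think about a projected-lifetime estimate is linear extrapolation from wear observed so far. The sketch below is an assumption for illustration only: the parameters `rated_cycles`, `used_cycles`, and `days_in_service` are hypothetical and are not the actual AutoSupport fields, and real estimates may use a different model.

```python
def projected_days_remaining(rated_cycles, used_cycles, days_in_service):
    """Linearly extrapolate remaining life from the average wear rate so far.

    Returns None when there is no wear history to extrapolate from.
    """
    if days_in_service <= 0 or used_cycles <= 0:
        return None
    remaining_cycles = max(rated_cycles - used_cycles, 0)
    # Equivalent to remaining_cycles / (used_cycles / days_in_service).
    return remaining_cycles * days_in_service / used_cycles


def needs_proactive_replacement(days_remaining, threshold_days=90):
    """Flag a drive whose projected life is below a (hypothetical) threshold."""
    return days_remaining is not None and days_remaining < threshold_days
```

As a worked example, a drive rated for 100,000 erase cycles that has used 20,000 of them in 365 days would have a projected 1,460 days remaining under this model.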
NetApp Flash Solutions
NetApp offers a broad range of flash-optimized storage solutions designed to increase application performance while maintaining high levels of reliability, including hybrid flash arrays, server-side flash, and all-flash arrays.
For more information on NetApp flash solutions, visit our Tech OnTap® community for monthly updates on best practices, technical case studies, and in-depth interviews with engineering experts.
Larry Freeman, Senior Technologist at NetApp
A frequent speaker and author, Larry currently educates IT professionals on the latest trends, techniques, and best practices in data storage technology. He is the author of the book Evolution of the Storage Brain and hosts the popular blog About Data Storage.
© 2013 NetApp, Inc. All rights reserved. No portions of this document may be reproduced without prior written consent of NetApp, Inc. Specifications are subject to change without notice. NetApp, the NetApp logo, Go further, faster, AutoSupport, Data ONTAP, Flash Cache, Flash Pool, and Tech OnTap are trademarks or registered trademarks of NetApp, Inc. in the United States and/or other countries. All other brands or products are trademarks or registered trademarks of their respective holders and should be treated as such.