- Jeff Kato, Taneja Group
Five years ago, flash technology transformed the storage market forever. Today, flash-first arrays are the new normal. Will a new shared storage access protocol called nonvolatile memory express over fabrics combined with the advent of storage class memory prove as disruptive to traditional storage over the next five years as NAND flash technology was in the recent past?
When NAND-based SSDs first came to market, data was accessed using traditional block protocols, such as SCSI, and the SSDs were physically attached to array controllers and servers using SATA and SAS bus infrastructure. Also, locally attached PCI-based add-in cards were popular for server-side caching, predominately for storage acceleration.
As NAND flash evolved, the SCSI protocol itself started limiting flash storage performance. So the industry created a new block protocol called nonvolatile memory express (NVMe) that capitalizes on the performance characteristics of nonvolatile memory -- such as flash's ability to serve far more parallel data accesses than was ever imagined for HDDs. NVMe initially targeted PCI Express bus interfaces to remove the SCSI performance bottleneck.
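To make the parallelism point concrete, here is a minimal Python sketch that compares how many commands each interface can keep in flight. The figures are commonly cited interface limits rather than numbers from this article: one queue of 32 commands for AHCI/SATA, and roughly 64K queues of 64K commands each for NVMe. Real devices expose far fewer queues than the ceiling, and the `QueueModel` class is purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueModel:
    """Illustrative model of a host interface's command queues."""
    name: str
    queues: int   # number of independent command queues
    depth: int    # commands each queue can hold

    @property
    def max_outstanding(self) -> int:
        # Upper bound on commands in flight across all queues.
        return self.queues * self.depth


# Commonly cited limits (assumptions, not taken from this article).
AHCI = QueueModel("AHCI/SATA", queues=1, depth=32)
NVME = QueueModel("NVMe", queues=64 * 1024, depth=64 * 1024)

if __name__ == "__main__":
    for model in (AHCI, NVME):
        print(f"{model.name}: up to {model.max_outstanding:,} commands in flight")
```

The practical gain comes less from the raw ceiling than from giving each CPU core its own queue, so cores do not contend for a single lock-protected queue.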
Recently, new classes of nonvolatile memory -- such as 3D XPoint, which operates much faster than traditional 3D NAND flash and approaches dynamic RAM speeds -- have emerged. Called storage class memory (SCM), this memory can also be addressed at the byte level versus the page level of NAND memory. SCM promises speed and durability advantages of up to a thousand times over NAND, but it comes at a premium cost, similar to how initial SSD pricing compared with HDDs. Using SCM only makes sense if it's accessed with the NVMe protocol.
Just as Fibre Channel and iSCSI extended the SCSI protocol to fabric, the NVMe protocol has been extended to fabric using a similar approach. The figure below offers a bird's-eye view of how NVMe over Fabrics (NVMe-oF) works.
NVMe over Fabrics operates over Fibre Channel, TCP or remote direct memory access (RDMA) networks and is, by design, transport agnostic. Applications written to take advantage of NVMe should operate over a fabric with limited latency impact. Since NVMe over Fabrics is relatively new, it will take time to gain native OS support from the likes of Windows and all flavors of Linux.
Why NVMe over Fabrics matters
Why should this matter to storage architectures? Because achieving the full performance of SCM or newer-generation NAND flash requires shared storage architectures to change with the advent of NVMe-oF. Most all-flash arrays (AFAs) boast latencies of less than a millisecond, and many leading AFAs have latencies of less than 500 microseconds. SCM-based arrays using NVMe-oF could improve latencies by another order of magnitude, approaching 50 microseconds.
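A rough way to see what those latency figures mean for an application: with a single outstanding request, the number of I/Os completed per second is simply the inverse of the latency. The sketch below runs that arithmetic for the three latency levels mentioned above. The function name and labels are illustrative, and real arrays serve many requests in parallel, so these are per-thread numbers, not array totals.

```python
def serial_iops(latency_us: float) -> float:
    """I/Os per second when only one request is outstanding at a time."""
    return 1_000_000 / latency_us


# Latency levels taken from the text above, in microseconds.
LATENCIES_US = {
    "typical all-flash array": 1000,
    "leading all-flash array": 500,
    "SCM array over NVMe-oF": 50,
}

if __name__ == "__main__":
    for label, latency in LATENCIES_US.items():
        print(f"{label}: {latency} us -> {serial_iops(latency):,.0f} IOPS per thread")
```

Going from 500 microseconds to 50 microseconds lets a latency-bound, single-threaded workload complete ten times as many operations per second with no other change.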
Many external arrays were designed to put rich data services directly in-line with the data path, which impedes the latency improvements promised by SCM. Initially, SCM will likely operate as a large cache, minimizing the need for expensive battery-backed dynamic RAM. Ultimately, however, SCM-based storage will store long-term data and will need to include the rich data services available in AFAs. Just as flash coexisted with hard disk, creating flash-first hybrid arrays, SCM will become a fully functional tier of storage, with high-capacity NAND-based SSDs coexisting in the same array.
With that in mind, we recommend AFA vendors consider the following in designing their external controller-based storage, server-centric software-defined storage (SDS) and hyper-converged infrastructure (HCI) architectures.
Controller-based storage considerations
Separate the control and data planes. It is going to be critical to optimize the data path for throughput and latency. The best way to offer rich data services at performance is to keep control operations separate from data plane operations.
Reconsider what data services you put in-line versus post-process. SCM will enable some advanced data services to operate with acceptable post-process functionality while optimizing the overall performance. It may be better to land data first with minimal latency and leave storage optimization features such as compression and deduplication for later. Basic durability and encryption functionality would be exempt from this approach.
Push legacy protocols to the side. Future storage products should make NVMe-oF the optimized primary access protocol and then provide file, object and legacy SCSI block protocols using a gateway-style approach.
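The in-line versus post-process trade-off described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not any vendor's design: the `LandFirstStore` class and its methods are hypothetical, writes land uncompressed so the write path does as little work as possible, and a later `optimize()` pass applies deduplication and compression. Durability and encryption, which would stay in-line, are left out entirely, as is garbage collection of unreferenced chunks.

```python
import hashlib
import zlib
from collections import deque


class LandFirstStore:
    """Toy block store: land writes fast, dedupe and compress later."""

    def __init__(self) -> None:
        self.blocks = {}        # block_id -> ("raw", bytes) or ("chunk", digest)
        self.chunks = {}        # digest -> compressed bytes, shared by duplicates
        self.pending = deque()  # block ids waiting for post-processing

    def write(self, block_id: str, data: bytes) -> None:
        # In-line path: store the data as-is and queue it for later work.
        self.blocks[block_id] = ("raw", bytes(data))
        self.pending.append(block_id)

    def optimize(self) -> None:
        # Post-process path: run when the system has spare cycles.
        while self.pending:
            block_id = self.pending.popleft()
            kind, payload = self.blocks[block_id]
            if kind != "raw":
                continue  # already optimized by an earlier queue entry
            digest = hashlib.sha256(payload).hexdigest()
            if digest not in self.chunks:
                self.chunks[digest] = zlib.compress(payload)
            self.blocks[block_id] = ("chunk", digest)

    def read(self, block_id: str) -> bytes:
        kind, payload = self.blocks[block_id]
        if kind == "raw":
            return payload
        return zlib.decompress(self.chunks[payload])


if __name__ == "__main__":
    store = LandFirstStore()
    store.write("a", b"x" * 4096)
    store.write("b", b"x" * 4096)   # duplicate of block "a"
    store.write("c", b"y" * 4096)
    store.optimize()
    print(len(store.chunks), "unique chunks for", len(store.blocks), "blocks")
```

Reads return the same bytes before and after optimization; the only thing that changes is how much space the data occupies and when the CPU cost of reducing it is paid.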
SDS and HCI considerations
Embrace locally attached SCM and NVMe SSDs. Industry-standard servers are where SCM and NVMe options first appeared. It will be much easier for SDS-based storage vendors to take advantage of these technologies and provide the most flexibility to modify software and mature the storage infrastructure.
Consider using JBOF (just a bunch of flash) that supports NVMe over Fabrics for composable direct-attached storage. NVMe-oF protocol chip suppliers, such as Mellanox, are enabling low-cost shared-storage JBOF. If SDS and, more importantly, HCI vendors embraced this technology, they could support configurations with flexible compute-to-storage ratios. This approach will also let them fully embrace popular blade server environments and make the compute-to-storage capacity composable on demand.
Consider NVMe-oF support across a cluster of nodes. SDS and HCI vendors may want to use NVMe-oF to enhance durability and capacity between adjacent server nodes by making locally attached NVMe devices accessible to the other nodes in the architecture.
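The composable JBOF idea in the list above comes down to simple bookkeeping: a shared pool of fabric-attached drives is carved up among compute nodes on demand, so the compute-to-storage ratio is no longer fixed by what fits in each server. The Python sketch below models only that bookkeeping. The `JbofPool` class, node names and drive names are hypothetical, and a real deployment would perform the attach step through NVMe-oF discovery and connection rather than a dictionary update.

```python
class JbofPool:
    """Toy model of a shared JBOF whose drives are assigned to nodes on demand."""

    def __init__(self, drive_ids) -> None:
        self.free = list(drive_ids)   # drives not yet assigned to any node
        self.assigned = {}            # node name -> list of drive ids

    def attach(self, node: str, count: int) -> list:
        # Hand `count` free drives to `node`, or fail without changing state.
        if count > len(self.free):
            raise ValueError(f"only {len(self.free)} drives free, {count} requested")
        drives = [self.free.pop() for _ in range(count)]
        self.assigned.setdefault(node, []).extend(drives)
        return drives

    def release(self, node: str) -> list:
        # Return all of a node's drives to the free pool.
        drives = self.assigned.pop(node, [])
        self.free.extend(drives)
        return drives


if __name__ == "__main__":
    pool = JbofPool(f"nvme{i}" for i in range(24))
    pool.attach("blade-1", 2)    # compute-heavy node, little storage
    pool.attach("blade-2", 12)   # storage-heavy node
    print(len(pool.free), "drives still unassigned")
    pool.release("blade-2")
    print(len(pool.free), "drives unassigned after release")
```

Two blades with identical hardware end up with very different storage footprints, which is the flexibility a fixed-ratio HCI node cannot offer.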
Some vendors will pragmatically add in SCM and NVMe technologies much like they did during the transition from HDD to SSD. I'm also seeing others go radical with all-SCM array options with extremely low latencies, enabling 10 million IOPS in a 2U form factor. This performance density is mind-boggling, and I fully expect we'll see some new breakout storage vendors over the next couple of years as these two transformative technologies mature and take hold.
Thanks to SCM and NVMe-oF, innovation is alive and well in the storage industry. I can't wait to see what the next five years hold. Now is the time to be asking your storage vendors how they plan to make the most of SCM and NVMe-oF and, just as importantly, whether your storage purchase today is future-proofed for tomorrow.