Toigo Partners International
Published: 03 Oct 2016
Last summer, I was on a quest to become more knowledgeable about flash memory, especially memory-based storage products that seem to be catching the eye of many of my clients. NVMe, or nonvolatile memory express -- shorthand for the Non-Volatile Memory Host Controller Interface specification that describes a way to do flash memory storage using PCIe architecture directly (that is, without a SATA disk controller) -- seems to have energized a lot of flash vendors. Just last June, NVM Express mavens delivered a specification for doing NVMe over Fabrics that some view as a key to building more scalable flash platforms.
Here's the thing, though: On closer examination, the evolution of all of these "standards" doesn't seem to be driven by any sort of agreed-upon problem requiring a common fix. It may be a case of engineering for its own sake.
What we know today
Flash memory's becoming cheaper to own than disk (at scale), according to vendors, and prices continue to fall. Flash is good at reads, however, and less so at writes. The inefficiencies of writing data to flash have led to a lot of engineering by vendors -- not on techniques for addressing the fundamental limitations of flash itself, but on spoofing and caching to make the inefficiencies seem less relevant.
One storage executive I've spoken with made this observation: "Even as flash chip technologies advance, write performance gets worse. So a lot of work is being done on controllers that spoof the write problem and mediate performance." At the end of the day, he said, the vendor community is counting on the falling price of the storage kit, together with an incremental improvement in I/O performance and latency over disk-based storage, to drive flash acquisition.
From this perspective, you can construe NVM Express as another attempt to improve flash performance, though it isn't clear that this theoretical advantage will mean anything in the real world. Simply put, NVMe is an effort to eliminate the SATA stack from flash storage data puts and gets, and to attach flash storage directly to the PCIe bus.
Sounds simple enough, but the backstory is a bit more telling.
With the advent of flash memory as storage in the early 2000s, every vendor tried to create its own technology -- controller, software stack, driver and so on -- for accessing flash without using the SATA interface. Each bragged that its proprietary technique was unique, more agile and faster than its competitors', and much better than emulating a hard disk and disk controller architecture. Flash deserved its own interface "to unlock its native parallelism." This was a "great proprietary opportunity," to quote my veteran industry insider, which resulted in a bunch of intellectual property for firms like Fusion-io.
However, in what some might call a rare display of consumer common sense (or a side effect of the Great Recession), proprietary flash technology quickly fell out of favor. Large firms simply didn't want to purchase something that would lock them into a particular vendor -- unless, it appears, the technology was from Intel.
When Intel announced at its Developer Conference in 2007 that it was working on its own PCIe interface for flash memory, one that would use fewer buffers and a standardized driver stack, it would've been a good idea for investors to pull the plug on a lot of these burgeoning flash storage companies. Clearly, Intel was going to get the main benefits from a "standardized" NVM Express specification, as it has since the initial spec's release in 2011.
A skeptical mind like mine sees the evolution of NVMe as being driven by Intel's desire to leverage its weight to "own" flash on PCIe, which is (for better or worse) exactly what's happened. It remains to be seen what practical benefits NVMe will deliver, though, something that hasn't been discussed very much.
Some will counter that the original goal of NVM Express was to enable flash to operate at its actual speeds and feeds, without the logjam created by having to pass I/O through a SATA bus. However, variants of this argument have been used to justify virtually every change and technology -- from software-defined storage and hyper-converged infrastructure to NVMe -- that's been offered to the I/O path and storage infrastructure for the past decade: "Legacy storage is too slow. We must have more speed to make applications go faster."
Only, the truth is a bit different.
Consider that application performance, particularly database performance, appears (in most cases) not to be gated by storage I/O at all. Truth be told, outside of high-performance computing environments, we rarely create enough application or storage I/O to saturate the storage I/O interconnect or even create significant queues or logjams.
Remember when SATA II and SATA III drives began to appear a few years ago? Consumers filled discussion boards with questions about compatibility between the latest SATA devices and motherboards offering the slower, older versions of the interconnect technology. Honest techs from system vendors quickly pointed out that no one was submitting data fast enough to "flood" the less performant interfaces in the first place. They didn't see SATA as non-performant or as the cause of much latency in storage. Standards kept improving -- as a function of engineering improvements, though, not to address any real computing deficits.
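For readers who want to check the "flooding" point against their own numbers, here is a rough back-of-the-envelope sketch in Python. It compares the throughput a workload generates (IOPS times request size) with the nominal payload bandwidth of each SATA generation: 150, 300 and 600 MB/s for the 1.5, 3 and 6 Gbps links after 8b/10b encoding. The workload figures (5,000 IOPS at 8 KiB) and the function name are hypothetical, picked only for illustration, and the calculation ignores protocol overhead, so treat the result as an approximation rather than a measurement.

```python
# Nominal payload bandwidth per SATA generation, in MB/s.
# SATA uses 8b/10b encoding, so 10 bits on the wire carry 1 byte of data:
# 1.5 Gbps -> 150 MB/s, 3 Gbps -> 300 MB/s, 6 Gbps -> 600 MB/s.
SATA_MB_PER_S = {"SATA I": 150, "SATA II": 300, "SATA III": 600}


def link_utilization(iops, io_size_kib, link_mb_per_s):
    """Return the fraction of a link's payload bandwidth a workload consumes.

    Illustrative helper (not from any vendor tool); ignores protocol overhead.
    """
    throughput_mb_per_s = iops * io_size_kib * 1024 / 1_000_000
    return throughput_mb_per_s / link_mb_per_s


if __name__ == "__main__":
    # Hypothetical workload: 5,000 IOPS of 8 KiB requests (an assumption).
    for name, bandwidth in SATA_MB_PER_S.items():
        share = link_utilization(5000, 8, bandwidth)
        print(f"{name}: {share:.1%} of link bandwidth")
```

Plug in the IOPS and request size reported by your own monitoring tools. If the result stays well below 100%, the interface itself is unlikely to be what's holding the application back.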
Flash forward a couple of years. Despite the efforts of VMware to attribute terrible virtual machine performance to storage, the two usually displayed no correlation, let alone any (provable) causal relationship.
As demonstrated three times now by DataCore Software, the real I/O logjam is at the end of the I/O path called raw I/O -- that is, the location where a processor core processes application instructions, generates I/O and moves those I/O requests out onto the I/O bus for execution. DataCore recently generated five million IOPS with the Storage Performance Council's SPC-1 Benchmark by using some code to parallelize the serial or sequential I/O processing of multicore CPUs (without DataCore's adaptive Parallel I/O, multicore chips still use the sequential I/O processing methods of single or unicore chips). In so doing, it demonstrated that none of the nonsense we were told about legacy storage or Fibre Channel fabrics was true. DataCore achieved the five million IOPS on an FC-connected storage kit using barely 50% of the bandwidth capacity of the link.
NVM Express strikes me as yet another technology advance driven more by engineering for its own sake and vendor market objectives than by any sort of well-defined consumer problem. Flash is getting more cost competitive with disk, and consumers have shown a willingness to throw flash at a performance problem even if it produces only minimal gains in latency or throughput. Maybe that is enough.
About the author:
Jon William Toigo is a 30-year IT veteran, CEO and managing principal of Toigo Partners International, and chairman of the Data Management Institute.