NVMe SSD speeds explained

The nonvolatile memory express protocol is tailor-made to make solid-state drives as fast as possible. Get up to speed on everything you ever wanted to know about NVMe and PCIe.

The NVMe protocol is quickly becoming the industry standard for supporting solid-state drives and other nonvolatile memory subsystems. NVMe SSD speeds are substantially better than those achievable with traditional storage protocols, such as SAS and SATA.

The nonvolatile memory express (NVMe) standard is based on the NVM Express specification published by the nonprofit group NVM Express Inc. The specification defines a register-level interface that enables host software to communicate with NVM subsystems.

NVMe SSDs have an edge over drives that use older storage protocols because NVMe streamlines operations and reduces CPU overhead, which results in lower latency and higher IOPS.

Introducing NVM Express

The NVM Express specification defines both a storage protocol and a host controller interface optimized for client and enterprise systems that use SSDs based on PCI Express (PCIe), a serial expansion bus standard that enables computers to attach to peripheral devices.

A PCIe bus can deliver lower latencies and higher transfer rates than older bus technologies, such as the PCI or PCI Extended (PCI-X) standards. With PCIe, each device has its own dedicated point-to-point connection, so devices don't have to compete for bandwidth on a shared bus.

Expansion slots that adhere to the PCIe standard can scale from one to 32 data transmission lanes, usually offered in groups of 1, 4, 8, 12, 16 or 32. The more lanes there are, the better the performance -- and the higher the costs. NVMe currently supports three form factors: add-in PCIe cards, M.2 SSDs and 2.5-inch U.2 SSDs.
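To make the relationship between lane count and performance concrete, the following is a minimal sketch that estimates one-direction PCIe bandwidth. The per-lane signaling rates and line encodings are the commonly published figures for PCIe 2.0 and 3.0, not values taken from this article, and the function name `pcie_bandwidth_gbytes` is hypothetical. Packet and protocol overhead is ignored, so real drives will see somewhat less.

```python
# Approximate usable PCIe bandwidth in one direction for a given lane count.
# Assumption: the per-lane rates and encodings below are the commonly
# published figures for PCIe 2.0 and 3.0; protocol overhead is ignored.

PCIE_GENERATIONS = {
    # generation: (gigatransfers/s per lane, payload bits, total bits on the wire)
    "2.0": (5.0, 8, 10),      # 8b/10b line encoding
    "3.0": (8.0, 128, 130),   # 128b/130b line encoding
}


def pcie_bandwidth_gbytes(generation: str, lanes: int) -> float:
    """Return approximate one-direction bandwidth in gigabytes per second."""
    rate_gt, payload_bits, total_bits = PCIE_GENERATIONS[generation]
    usable_gbits_per_lane = rate_gt * payload_bits / total_bits
    return usable_gbits_per_lane * lanes / 8  # 8 bits per byte


for lanes in (1, 4, 8, 16):
    print(f"PCIe 3.0 x{lanes}: {pcie_bandwidth_gbytes('3.0', lanes):.2f} GB/s")
```

Under these assumptions, bandwidth scales linearly with lane count, which is why a four-lane M.2 or U.2 drive has a lower ceiling than a wider add-in card.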

Figure: NVM Express timeline

NVMe SSDs are fast in part because the protocol uses PCIe to map commands and responses directly into the host's shared memory. The protocol is optimized for local use and supports both slot-in storage and external subsystems connected directly to the host computer. NVMe cannot be used with hard disk drives.

NVMe performance benefits

Over the years, numerous protocols have facilitated connectivity between host software and peripheral drives. Two of the most common protocols are SATA and SAS. The SATA protocol is based on the Advanced Technology Attachment standard, and the SAS protocol is based on the Small Computer System Interface standard.

The SATA and SAS protocols were developed specifically for HDD devices. Although SAS is generally considered to be faster and more reliable, both protocols can handle HDD workloads easily. If a system runs into storage-related roadblocks, it is often because of the drive itself or other factors, not because of the protocol.

SSDs have changed this equation. Their higher IOPS can quickly overwhelm the older protocols, which causes them to reach their limits before they can take full advantage of the drive's performance capabilities.

Figure: Backplanes used for SATA, SAS and NVM Express (SSD form factor) drives

NVMe was developed from the ground up specifically for SSDs to improve throughput and IOPS while reducing latency. Today's NVMe-based drives attain throughputs up to 16 gigabytes per second (GBps), and vendors are pushing for 32 GBps or higher. Many NVMe-based drives reach well over 500,000 IOPS. Some deliver 1.5 million, 2 million or even 10 million IOPS. At the same time, latencies continue to drop; many drives achieve latencies below 20 microseconds, and some below 10.
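IOPS and latency are linked through the number of commands in flight. As a rough illustration, Little's law (average concurrency = throughput x average latency) estimates how many I/Os must be outstanding to sustain a given IOPS figure at a given latency. This is a sketch only: the article does not state the queue depths behind its figures, the function name is hypothetical, and the law relates long-run averages rather than any specific drive's behavior.

```python
def required_outstanding_ios(iops: float, latency_us: float) -> float:
    """Little's law: average I/Os in flight = IOPS x average latency (seconds)."""
    return iops * latency_us / 1_000_000  # convert microseconds to seconds


# Figures quoted in the text, paired here purely for illustration.
examples = [
    (500_000, 20),      # 500,000 IOPS at 20 microseconds
    (2_000_000, 20),    # 2 million IOPS at 20 microseconds
    (10_000_000, 10),   # 10 million IOPS at 10 microseconds
]

for iops, latency_us in examples:
    in_flight = required_outstanding_ios(iops, latency_us)
    print(f"{iops:>10,} IOPS at {latency_us} us needs about {in_flight:.0f} I/Os in flight")
```

The higher the IOPS target at a fixed latency, the more commands the host has to keep outstanding, which is where the protocol's queuing limits start to matter.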

The older protocols don't perform nearly as well on SSDs. Today's SATA-based drives attain throughputs of only 6 gigabits per second (Gbps) and IOPS that top out at about 100,000. Latencies easily exceed 100 microseconds. SAS drives deliver somewhat better performance; they provide throughputs up to 12 Gbps and IOPS averaging between 200,000 and 400,000. In some cases, SAS latencies have fallen below 100 microseconds, but not by much.
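Because the SATA and SAS figures are line rates in gigabits while the NVMe figure is in gigabytes, a short conversion helps when comparing them. The sketch below assumes 8b/10b line encoding for the 6 Gbps SATA and 12 Gbps SAS links, which is the commonly documented encoding for those generations but is not stated in this article. The function name `line_rate_to_mbytes` is hypothetical.

```python
def line_rate_to_mbytes(gbits_per_s: float, payload_bits: int = 8, total_bits: int = 10) -> float:
    """Convert a raw line rate in Gbps to usable megabytes per second.

    Assumption: 8b/10b encoding, so 8 of every 10 bits on the wire carry data.
    """
    usable_bits_per_s = gbits_per_s * 1e9 * payload_bits / total_bits
    return usable_bits_per_s / 8 / 1e6  # bits -> bytes -> megabytes


sata_mb = line_rate_to_mbytes(6)    # SATA at 6 Gbps
sas_mb = line_rate_to_mbytes(12)    # SAS at 12 Gbps
nvme_mb = 16 * 1000                 # the 16 GBps NVMe figure, already in bytes

print(f"SATA: {sata_mb:.0f} MB/s")
print(f"SAS:  {sas_mb:.0f} MB/s")
print(f"NVMe: {nvme_mb} MB/s ({nvme_mb / sata_mb:.1f}x SATA)")
```

On these assumptions, the gap between the interfaces is larger than the raw numbers 6, 12 and 16 suggest, because the units differ by a factor of eight before encoding overhead is counted.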

That said, metrics that measure NVMe SSD speeds can vary widely. These figures are trends rather than absolutes in light of the technology's dynamic nature. Even so, it's clear that NVMe significantly outperforms the other protocols on every front. One reason for this is that NVMe uses a more streamlined command set to process I/O requests, which requires fewer than half the CPU instructions that SATA or SAS require.

NVMe also has a far more robust and efficient system for queuing messages. For example, SATA and SAS each support only one I/O queue at a time. The SATA queue can contain up to 32 outstanding commands, and the SAS queue can contain up to 256. NVMe can support up to 65,535 I/O queues and roughly 64,000 (64K) commands per queue.
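The scale of the difference is easier to see when the queue counts and depths are multiplied out. The sketch below uses only the limits quoted above; the `QueueLimits` class is a hypothetical helper for illustration, and these are protocol ceilings, not what any particular drive or driver actually configures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueLimits:
    """Protocol-level queuing limits as quoted in the text."""
    name: str
    queues: int
    depth: int

    @property
    def max_outstanding(self) -> int:
        # Theoretical ceiling on commands in flight across all queues.
        return self.queues * self.depth


LIMITS = [
    QueueLimits("SATA", queues=1, depth=32),
    QueueLimits("SAS", queues=1, depth=256),
    QueueLimits("NVMe", queues=65_535, depth=64_000),
]

for limit in LIMITS:
    print(f"{limit.name:<5} {limit.queues:>6} queue(s) x {limit.depth:>6} = {limit.max_outstanding:>13,}")
```

In practice, the value of multiple queues is less about the theoretical total and more about letting separate CPU cores submit I/O without contending for a single queue.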

Video: Demartek founder Dennis Martin explains NVM Express and NVMe over Fabrics and why data storage administrators need to keep an eye on these protocols.

This queuing mechanism lets NVMe make much better use of the parallel processing capabilities of an SSD, something the other protocols cannot do. In addition, NVMe uses direct memory access (DMA) over the PCIe bus to map I/O commands and responses directly to the host's shared memory. This reduces CPU overhead even further and improves NVMe SSD speeds. As a result, each CPU cycle can support more IOPS, and latency in the host software stack drops.
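The queuing model described above can be pictured with a toy simulation. In this sketch, each queue pair holds commands submitted by the host and completions posted by the device, and each CPU core gets its own pair so cores do not contend with one another. This is a conceptual illustration only: the `QueuePair` class, its methods and the one-pair-per-core layout are assumptions made for the example, not the register-level interface defined by the NVM Express specification.

```python
from collections import deque
from typing import Dict, List


class QueuePair:
    """Toy model of one submission/completion queue pair in host memory."""

    def __init__(self, depth: int) -> None:
        self.depth = depth
        self.submission: deque = deque()
        self.completion: deque = deque()

    def submit(self, command: str) -> bool:
        """Host side: enqueue a command; refuse it if the queue is full."""
        if len(self.submission) >= self.depth:
            return False
        self.submission.append(command)
        return True

    def device_process(self) -> None:
        """Device side: drain submissions and post one completion for each."""
        while self.submission:
            command = self.submission.popleft()
            self.completion.append(f"done:{command}")

    def reap(self) -> List[str]:
        """Host side: collect and clear posted completions."""
        finished = list(self.completion)
        self.completion.clear()
        return finished


# Hypothetical layout: one queue pair per CPU core.
pairs: Dict[int, QueuePair] = {core: QueuePair(depth=4) for core in range(2)}
pairs[0].submit("read lba=0")
pairs[1].submit("write lba=8")

for pair in pairs.values():
    pair.device_process()

results = {core: pair.reap() for core, pair in pairs.items()}
print(results)
```

A single-queue protocol would funnel both cores through one structure; with independent pairs, each core submits and reaps its own I/O.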

Introducing NVMe over Fabrics

Despite the advantages that NVMe provides, the protocol is limited to individual hosts with direct-attached NVM subsystems (slot-in or externally cabled). Although this can be useful in some scenarios, many organizations look for distributed systems they can implement in their data centers. For this reason, NVM Express has developed a second specification: NVM Express over Fabrics (NVMe-oF).

The new standard was published in June 2016 to extend NVMe's benefits across network fabrics such as Ethernet, InfiniBand and Fibre Channel. The organization estimates that 90% of the NVM Express over Fabrics specification is the same as the NVM Express specification. The primary difference between the two is the way the protocols handle commands and responses between the host and the NVM subsystem.

Figure: NVMe-oF diagram

The NVMe protocol maps the commands and responses to the host's shared memory. The NVMe-oF protocol follows a message-based model to facilitate communications between the NVMe host and the network-connected NVMe storage device. The new protocol extends the distance over which NVMe devices can be accessed within the data center, while making it possible to scale out to a large number of devices.

The NVMe-oF specification currently provides two methods to facilitate communications over fabric transports. The first uses RDMA to support connectivity over fabrics such as InfiniBand, RDMA over Converged Ethernet (RoCE) and the Internet Wide Area RDMA Protocol (iWARP). The second approach is concerned specifically with the Fibre Channel transport. At some point, this effort will also include Fibre Channel over Ethernet.

Join the conversation


How will NVMe-oF, NVMe over Fibre Channel and NVMe over Ethernet affect SSD speeds?
I think it would be good if we could get the facts and figures for NVMe-oF for InfiniBand, iWARP, RoCE and FC. Also, do we bypass the kernel when memory is shared?
"16 gigabytes per second (GBps)" This should be 16 gigabits/s.
Some remarks on:
For example, SATA and SAS each support only one I/O queue at a time. The SATA queue can contain up to 32 outstanding commands, and the SAS queue can contain up to 256. NVMe can support up to 65,535 queues and up to 64,000 commands per queue.
In fact, SCSI-3 tagged queuing uses 64 bits for queuing, allowing a queue depth of 2^64. source: In fact, if we consider high-performance, low-latency storage, lowering queue depth lowers latencies. Queue depth is welcome only on metropolitan or WAN-class fabrics (to improve asynchronous replication bandwidth over a high-latency fabric) rather than for production workloads.