The NVMe protocol is quickly becoming the industry standard for supporting solid-state drives and other non-volatile memory subsystems. NVMe SSDs are substantially faster than drives that rely on traditional storage protocols, such as SAS and SATA.
The non-volatile memory express standard is based on the NVM Express specification published by the nonprofit group NVM Express Inc. The specification defines a register-level interface that enables host software to communicate with NVM subsystems.
NVMe SSDs have an edge over drives based on older storage protocols because NVMe streamlines operations and reduces CPU overhead, which results in lower latency and higher IOPS.
What's behind the NVMe protocol
The NVM Express specification defines both a storage protocol and a host controller interface optimized for client and enterprise systems that use SSDs based on PCIe, a serial expansion bus standard that enables computers to attach to peripheral devices.
A PCIe bus can deliver lower latencies and higher transfer rates than older bus technologies, such as the PCI or PCI Extended (PCI-X) standards. With PCIe, each device has its own dedicated point-to-point connection, so devices don't have to compete for bandwidth.
Expansion slots that adhere to the PCIe standard can scale from one to 32 data transmission lanes, usually offered in groups of one, four, eight, 12, 16 or 32. The more lanes there are, the better the performance -- and the higher the costs. NVMe supports three form factors: add-in PCIe cards, M.2 SSDs and 2.5-inch U.2 SSDs.
The NVMe protocol improves SSD speeds by using PCIe to map commands and responses directly through the host's shared memory. The protocol is optimized for local use and supports both slot-in storage and external subsystems connected directly to the host computer. NVMe cannot be used with hard disk drives.
The NVMe 1.4 additions and revisions
In 2019, the NVM Express group released the NVMe 1.4 specification. It built on and improved NVMe 1.3, which was released in 2017. NVMe 1.4 introduced several new and improved features. For example, the revised specification defined a PCIe persistent memory region where contents persist across power cycles. It defined a predictable latency mode that enables a well-behaved host to achieve a deterministic read latency.
NVMe 1.4 also made it possible to implement a persistent event log in NVMe subsystems, report asymmetric namespace access characteristics to the host and enable the host to associate an NVM set and I/O submission queue. The revision added a verification command to check the integrity of stored data and metadata, and it defined performance and endurance hints that allow the controller to specify preferred granularities and alignments for write and deallocate operations.
NVMe 1.4 updated many existing features, such as enhancing the host memory buffer and enabling write streams to be shared across multiple hosts. The specification also defined a controller mechanism for communicating namespace allocation granularities to the host and made it possible to prevent deallocation after a sanitize operation.
These are just some of the new and enhanced features in NVMe 1.4. In addition, the revision included details about changes that need to be made to existing NVMe implementations to achieve 1.4 compliance.
NVMe speeds and performance
Over the years, numerous protocols have facilitated connectivity between host software and peripheral drives. Two of the most common protocols are SATA and SAS. The SATA protocol is based on the Advanced Technology Attachment standard, and the SAS protocol is based on the SCSI standard.
The SATA and SAS protocols were developed specifically for HDD devices. Although SAS is generally considered to be faster and more reliable, both protocols can easily handle HDD workloads. If a system runs into storage-related roadblocks, it is often because of the drive itself or other factors, not because of the protocol.
SSDs have changed this equation. Their higher IOPS can quickly overwhelm the older protocols, which causes them to reach their limits before they can take full advantage of the drive's performance capabilities.
NVMe was developed from the ground up specifically for SSDs to improve throughput and IOPS while reducing latency. Each PCIe 4.0 lane can support up to 2 GBps, so today's NVMe-based drives can theoretically attain throughputs up to 32 GBps, but that assumes the drives are based on PCIe 4.0 and use 16 PCIe lanes. Today's PCIe 4.0 SSDs tend to be four-lane devices with throughputs closer to 7 GBps. Although some drives reach only 200,000 IOPS, many exceed 500,000 IOPS, and some reach as high as 10 million. At the same time, latencies continue to drop; many drives achieve latencies below 20 microseconds (µs) and some below 10 µs.
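The lane arithmetic behind those throughput ceilings can be sketched in a few lines. This is a minimal illustration, not a benchmark: the 2 GBps per-lane figure for PCIe 4.0 is the rounded value quoted in the text, the PCIe 3.0 value assumes half of that, and the function name is hypothetical.

```python
# Rounded per-lane bandwidth in gigabytes per second (GBps).
# 2.0 for PCIe 4.0 is the figure quoted in the article; 1.0 for PCIe 3.0
# is an assumption (half the PCIe 4.0 rate).
PER_LANE_GBPS = {"3.0": 1.0, "4.0": 2.0}


def theoretical_throughput_gbps(pcie_gen: str, lanes: int) -> float:
    """Return the theoretical ceiling, in GBps, for a link of the given width."""
    if lanes <= 0:
        raise ValueError("lanes must be a positive integer")
    return PER_LANE_GBPS[pcie_gen] * lanes


if __name__ == "__main__":
    for lanes in (4, 16):
        ceiling = theoretical_throughput_gbps("4.0", lanes)
        print(f"PCIe 4.0 x{lanes}: up to {ceiling:.0f} GBps")
```

A four-lane PCIe 4.0 drive has a ceiling of 8 GBps under these rounded figures, which is consistent with real-world drives landing closer to 7 GBps once protocol overhead is accounted for.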
The older protocols don't perform nearly as well on SSDs. Today's SATA-based drives can attain throughputs of only 6 Gbps (gigabits, not gigabytes, per second), with IOPS topping out at about 100,000. Latencies typically exceed 100 µs, although some newer SSDs can achieve much lower latencies. SAS drives deliver somewhat better performance; they provide throughputs up to 12 Gbps and IOPS averaging between 200,000 and 400,000. Even so, lower IOPS are not unusual. In some cases, SAS latencies have fallen below 100 µs, but not by much.
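Because the NVMe figures above are given in gigabytes per second and the SATA and SAS figures in gigabits per second, a quick conversion helps put them on the same scale. The sketch below assumes 8b/10b line encoding (10 bits on the wire for every 8 bits of payload), which is what 6 Gbps SATA and 12 Gbps SAS use; the function name is hypothetical.

```python
def line_rate_to_payload_mbps(line_rate_gbps: float,
                              payload_bits: int = 8,
                              coded_bits: int = 10) -> float:
    """Convert a raw line rate in Gbps to usable payload bandwidth in MBps.

    Assumes 8b/10b encoding by default: only payload_bits out of every
    coded_bits transmitted carry data.
    """
    usable_bits_per_second = line_rate_gbps * 1e9 * payload_bits / coded_bits
    return usable_bits_per_second / 8 / 1e6  # bits -> bytes -> megabytes


if __name__ == "__main__":
    for name, rate in (("SATA", 6.0), ("SAS", 12.0)):
        print(f"{name} at {rate:.0f} Gbps: about "
              f"{line_rate_to_payload_mbps(rate):.0f} MBps usable")
```

Under that assumption, the 6 Gbps SATA link carries about 600 MBps and the 12 Gbps SAS link about 1,200 MBps, both well under the multi-gigabyte figures cited for NVMe drives.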
That said, metrics that measure NVMe SSD speeds, such as throughput or transfer rate, can vary widely. These figures are trends, rather than absolutes, considering the technology's dynamic nature. Factors such as workload type -- write vs. read or random vs. sequential -- can make a significant difference in the maximum speed an NVMe drive actually achieves. Even so, it's clear that NVMe significantly outperforms the SAS and SATA protocols on every front, especially when used in conjunction with PCIe 4.0, which doubles the NVMe bandwidth achievable with PCIe 3.0.
One reason for this is that NVMe uses a more streamlined command set to process I/O requests, which requires fewer than half the CPU instructions that SATA or SAS needs. NVMe also has a far more extensive and efficient system for queuing commands. For example, SATA and SAS each support only one I/O queue at a time. The SATA queue can contain up to 32 outstanding commands and the SAS queue can contain up to 256. NVMe can support up to 65,535 I/O queues and about 64,000 (64K) commands per queue.
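To make the scale of that difference concrete, the queue figures quoted above can be multiplied out. This is only an upper-bound illustration: the class and field names are hypothetical, the 64,000 value is the rounded figure from the text, and real drives and drivers configure far fewer queues than the maximum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueModel:
    """Queue limits for a storage protocol, using the figures from the text."""
    name: str
    queues: int
    commands_per_queue: int

    @property
    def max_outstanding(self) -> int:
        # Upper bound on commands that can be in flight at once.
        return self.queues * self.commands_per_queue


MODELS = [
    QueueModel("SATA", 1, 32),
    QueueModel("SAS", 1, 256),
    QueueModel("NVMe", 65_535, 64_000),  # rounded per-queue figure
]

if __name__ == "__main__":
    for model in MODELS:
        print(f"{model.name}: up to {model.max_outstanding:,} outstanding commands")
```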
This queuing mechanism lets NVMe make much better use of the parallel processing capabilities of an SSD, something the other protocols cannot do. In addition, NVMe uses direct memory access (DMA) over the PCIe bus to map I/O commands and responses directly to the host's shared memory. This reduces CPU overhead even further and improves NVMe SSD speeds. As a result, each CPU cycle can support more IOPS, and latencies in the host software stack drop.
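The queuing model described above can be pictured as a pair of queues in host memory: the host places commands in a submission queue and the drive posts results to a matching completion queue. The sketch below is a deliberately simplified, single-threaded model of that idea, assuming one queue pair per CPU core; the class and method names are hypothetical, and it omits doorbell registers, interrupts and the actual DMA transfers.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class QueuePair:
    """Toy model of one submission/completion queue pair in host memory."""
    depth: int
    submission: deque = field(default_factory=deque)
    completion: deque = field(default_factory=deque)

    def submit(self, command_id: int, opcode: str) -> bool:
        """Host side: enqueue a command if the submission queue has room."""
        if len(self.submission) >= self.depth:
            return False  # queue full; the caller must retry later
        self.submission.append((command_id, opcode))
        return True

    def controller_process(self) -> int:
        """Device side: consume pending commands and post completions."""
        processed = 0
        while self.submission:
            command_id, _opcode = self.submission.popleft()
            self.completion.append((command_id, "SUCCESS"))
            processed += 1
        return processed

    def reap(self) -> list:
        """Host side: collect and clear posted completions."""
        done = list(self.completion)
        self.completion.clear()
        return done


if __name__ == "__main__":
    # One independent queue pair per (hypothetical) CPU core, so cores
    # never contend for a single shared queue.
    pairs = [QueuePair(depth=4) for _core in range(2)]
    pairs[0].submit(1, "READ")
    pairs[1].submit(2, "WRITE")
    for core, pair in enumerate(pairs):
        pair.controller_process()
        print(f"core {core}: {pair.reap()}")
```

Giving each core its own queue pair is what allows many commands to be issued in parallel without locking, in contrast to the single queue that SATA and SAS provide.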
NVMe over fabrics expands out
Despite the advantages that NVMe provides, the protocol is limited to individual hosts with direct-attached NVM subsystems (slot-in or externally cabled). Although this can be useful in some scenarios, many organizations look for distributed systems they can implement in their data centers. For this reason, NVM Express has developed a second specification: NVMe over fabrics (NVMe-oF).
The new standard was published in June 2016 to extend NVMe's benefits across network fabrics such as Ethernet, InfiniBand and Fibre Channel. The organization estimates that 90% of the NVMe-oF specification is the same as the NVMe specification. The primary difference between the two is the way the protocols handle commands and responses between the host and the NVM subsystem.
The NVMe protocol maps the commands and responses to the host's shared memory. The NVMe-oF protocol instead follows a message-based model to facilitate communications between the NVMe host and the network-connected NVMe storage device. The new protocol extends the distances over which NVMe devices can be accessed within the data center, while making it possible to scale out to many devices.
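One way to picture the message-based model is that the same fixed-size command entry NVMe would place in a memory-mapped queue is instead wrapped in a message, which NVMe-oF calls a capsule, and sent across the fabric. The sketch below is a rough illustration only: it assumes the 64-byte command entry and 16-byte completion entry sizes used by NVMe, the class and method names are hypothetical, and the simple concatenation stands in for the transport-specific framing that a real fabric adds.

```python
from dataclasses import dataclass

SQE_SIZE = 64  # bytes in an NVMe submission queue entry (the command)
CQE_SIZE = 16  # bytes in an NVMe completion queue entry (the response)


@dataclass(frozen=True)
class CommandCapsule:
    """Simplified command capsule: a command entry plus optional data."""
    sqe: bytes
    data: bytes = b""

    def __post_init__(self) -> None:
        if len(self.sqe) != SQE_SIZE:
            raise ValueError(f"command entry must be {SQE_SIZE} bytes")

    def to_message(self) -> bytes:
        # Illustrative only: real transports add their own headers.
        return self.sqe + self.data


@dataclass(frozen=True)
class ResponseCapsule:
    """Simplified response capsule carrying a completion entry."""
    cqe: bytes

    def __post_init__(self) -> None:
        if len(self.cqe) != CQE_SIZE:
            raise ValueError(f"completion entry must be {CQE_SIZE} bytes")


if __name__ == "__main__":
    capsule = CommandCapsule(sqe=bytes(SQE_SIZE), data=b"payload")
    print(f"message length: {len(capsule.to_message())} bytes")
```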
The NVMe-oF specification originally provided two methods for communicating over fabric transports. The first uses RDMA to support connectivity on fabrics such as InfiniBand, RDMA over Converged Ethernet and the Internet Wide Area RDMA Protocol. The second approach is specific to the Fibre Channel transport and includes both the Fibre Channel and Fibre Channel over Ethernet fabrics.
In 2019, NVM Express released the NVMe-oF 1.1 specification, which added support for the TCP transport binding. NVMe over TCP makes it possible to use NVMe-oF on standard Ethernet networks without needing to make hardware or configuration changes. The new standard helps bridge the gap between DAS and SANs, while moving NVMe-oF closer to becoming the de facto standard for enterprise storage.