NVMe (non-volatile memory express) is a host controller interface and storage protocol created to accelerate the transfer of data between enterprise and client systems and solid-state drives (SSDs) over a computer's high-speed Peripheral Component Interconnect Express (PCIe) bus.
As solid-state technology became the preferred medium in the storage market, it quickly became clear that existing interfaces and protocols -- notably, Serial Advanced Technology Attachment (SATA) and Serial-Attached SCSI (SAS) -- were not suitable, particularly in data center environments. Work on a new protocol designed specifically for NAND flash began as early as 2007, with Intel taking the lead. In early 2011, the initial NVMe spec was released -- nearly 100 tech companies were involved in the development.
The NVMe specification defines a register interface, command set and collection of features for PCIe-based SSDs with the goals of high performance and interoperability across a broad range of NVM subsystems. The NVMe specification does not stipulate the ultimate usage model, such as solid-state storage, main memory, cache memory or backup memory.
NVMe provides an alternative to the Small Computer System Interface (SCSI) standard and the ATA standard for connecting and transmitting data between a host system and a peripheral target storage device. The ATA command set in use with SATA SSDs and the SCSI command set for SAS SSDs were developed at a time when hard disk drives (HDDs) and tape were the primary storage media. NVMe was designed for use with faster media.
The main benefits of NVMe-based PCIe SSDs over SAS-based and SATA-based SSDs are reduced latency in the host software stack, higher input/output operations per second (IOPS), and potentially lower power consumption, depending on the form factor and the number of PCIe lanes in use.
The NVMe protocol can support SSDs that use different types of non-volatile memory, including NAND flash and the 3D XPoint technology developed by Intel and Micron Technology. NVMe reference drivers are available for a variety of operating systems (OSes), including Windows and Linux.
NVMe doesn't just make it possible for existing applications to run faster and more efficiently; it is actually a key enabler of newer and evolving technologies and applications such as the internet of things (IoT), artificial intelligence (AI) and machine learning (ML), which can all benefit from the low latency and high performance of NVMe-attached storage.
How NVMe works
NVMe maps input/output (I/O) commands and responses to shared memory in a host computer over the PCIe interface. The NVMe interface supports parallel I/O with multicore processors to facilitate high throughput and mitigate central processing unit (CPU) bottlenecks.
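The paired-queue design described above can be sketched in a few lines of Python. This is a toy model, not real driver code: the class and field names are invented for illustration, and a real controller processes queues in hardware.

```python
from collections import deque

# Toy model of NVMe's paired queues: each CPU core can own its own
# submission queue (SQ) and completion queue (CQ), so cores submit
# I/O in parallel without contending for one shared queue.
class QueuePair:
    def __init__(self, qid):
        self.qid = qid
        self.sq = deque()   # host -> controller: commands
        self.cq = deque()   # controller -> host: completions

    def submit(self, command_id, opcode):
        # Host writes a command entry, then "rings the doorbell"
        self.sq.append({"cid": command_id, "opcode": opcode})

    def process(self):
        # Controller drains the SQ and posts completions to the CQ
        while self.sq:
            cmd = self.sq.popleft()
            self.cq.append({"cid": cmd["cid"], "status": "success"})

# One queue pair per core (four cores here, purely illustrative)
pairs = [QueuePair(qid) for qid in range(1, 5)]
for qp in pairs:
    qp.submit(command_id=qp.qid * 10, opcode="read")
    qp.process()

print([qp.cq[0]["cid"] for qp in pairs])  # [10, 20, 30, 40]
```

Because each core submits to its own queue pair, no lock is needed across cores, which is the property that lets NVMe scale with multicore CPUs.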
NVMe offers a more streamlined command set to process an I/O request than the SCSI and ATA command sets do. NVMe requires less than half as many CPU instructions as the SCSI command set does with SAS devices or the ATA command set does with SATA drives.
NVMe SSDs vs. SATA SSDs
SATA is a communications protocol developed for computers to interact with HDD storage systems. Introduced in 2000 by a group of major tech players, SATA superseded parallel ATA and quickly became the ubiquitous storage protocol for computers ranging from laptops to servers. The spec has been revised several times over the years; the current revision runs at 6 Gbps with effective throughput of up to 600 MBps.
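The gap between the 6 Gbps signaling rate and the 600 MBps effective throughput comes from SATA's 8b/10b line encoding, in which every 10 transmitted bits carry 8 bits of payload. The arithmetic is simple enough to check:

```python
# Why a 6 Gbps SATA link tops out near 600 MBps: SATA uses 8b/10b
# line encoding, so 10 bits on the wire carry 8 bits of payload.
line_rate_bps = 6_000_000_000               # SATA III signaling rate
payload_bps = line_rate_bps * 8 / 10        # strip 8b/10b overhead
payload_mbps = payload_bps / 8 / 1_000_000  # bits -> bytes -> MB/s

print(payload_mbps)  # 600.0
```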
Although developed for hard disk technology with mechanical spinning platters and actuator-controlled read/write heads, early SSDs were marketed with SATA interfaces to take advantage of the existing SATA ecosystem. It was a convenient design and helped accelerate SSD adoption, but it wasn't -- and still isn't -- the ideal interface for NAND flash storage devices and was increasingly seen as a system bottleneck.
Designed for flash, NVMe's speed and low latency leave SATA in the dust, and NVMe allows for much higher storage capacities in smaller form factors such as M.2. Generally, NVMe performance parameters outdistance those of SATA by a factor of five or more.
SATA may be more established with a longer history and lower implementation costs than NVMe, but it's clearly hard disk technology that's been retrofitted to more modern storage media.
NVMe SSDs vs. SAS SSDs
NVMe supports 64,000 commands in a single message queue and a maximum of 65,535 I/O queues. By contrast, a SAS device's queue depth typically supports up to 256 commands and a SATA drive supports up to 32 commands in one queue.
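A back-of-the-envelope comparison makes the scale of those queue limits concrete. The figures below are the ones cited above; the single-queue counts for SAS and SATA reflect their one-command-queue-per-device designs.

```python
# Maximum outstanding commands implied by the queue limits cited above
limits = {
    "SATA": {"queues": 1, "depth": 32},
    "SAS":  {"queues": 1, "depth": 256},
    "NVMe": {"queues": 65_535, "depth": 64_000},
}

for name, q in limits.items():
    print(f"{name}: up to {q['queues'] * q['depth']:,} commands in flight")
```

In principle an NVMe device can therefore track billions of outstanding commands, versus a few hundred for SAS, although real drives and drivers configure far fewer queues than the spec allows.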
However, NVMe-based PCIe SSDs are currently more expensive than SAS- and SATA-based SSDs of equivalent capacity, although that delta is narrowing. Also, high-end enterprise NVMe SSDs may consume more power than SAS or SATA SSDs. The SCSI Trade Association claims the more mature SAS SSDs offer additional advantages over NVMe PCIe SSDs, such as greater scalability, hot pluggability and time-tested failover capabilities. NVMe PCIe SSDs also may provide a level of performance that many applications do not require.
History and evolution of NVM Express
The Non-Volatile Memory Host Controller Interface (NVMHCI) Workgroup began to develop the NVMe specification in 2009 and published the 1.0 version on March 1, 2011. The 1.0 specification included the queueing interface, the NVM command set, administration command set and security features.
The NVMHCI Workgroup, commonly known as the NVM Express Workgroup, released an update to the NVMe specification on Oct. 11, 2012. NVMe 1.1 added support for SSDs with multiple PCIe ports to enable multipath I/O and namespace sharing. Other new capabilities included autonomous power state transitions during idle time to reduce energy needs and reservations allowing two or more hosts to coordinate access to a shared namespace to improve fault tolerance.
The NVM Express Workgroup held its first Plugfest in May 2013 to enable companies to test their products' compliance with the NVMe specification and to check interoperability with other NVMe products.
The NVM Express Workgroup incorporated under the NVM Express organization name in March 2014. Founding members at the time included Cisco Systems, Dell, EMC, Western Digital's HGST subsidiary, Intel, LSI, Micron Technology, NetApp, Oracle, PMC-Sierra, Samsung Electronics, SanDisk and Seagate Technology.
The NVM Express organization later became known simply as NVM Express Inc. The nonprofit organization has more than 100 technology companies as members.
On Nov. 17, 2015, the NVM Express organization ratified the 1.0 version of the NVM Express Management Interface (NVMe-MI) to provide an architecture and command set to manage a non-volatile memory subsystem out of band. NVMe-MI enables a management controller to perform tasks such as SSD device and capability discovery, health and temperature monitoring, and nondisruptive firmware updates. Without NVMe-MI, IT managers generally relied on proprietary, vendor-specific management interfaces to enable administration of PCIe SSDs.
NVMe 1.3 feature enhancements
NVM Express released NVMe 1.3 in June 2017. Highlights center on sanitize operations, a new framework known as Directives and virtualization enhancements.
In a sanitize operation, all user data in the NVMe subsystem is modified so that recovery is not possible "from any cache, nonvolatile media or controller memory buffer," according to an NVM Express reference sheet. Sanitize operations are recommended when an SSD is being retired or reused for a new use case. Sanitize modes include low-level block erase on NAND media, crypto-erase to change a media encryption key and overwrite.
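The crypto-erase mode deserves a closer look, because it can destroy all user data without rewriting any media. The sketch below illustrates the idea in Python; the toy XOR cipher and class names are invented for illustration (real drives use hardware AES), but the principle is the same: data is always stored encrypted under a media key, so discarding that key makes every block unrecoverable.

```python
import os

# Conceptual sketch of NVMe sanitize "crypto-erase". Data on the media
# is stored encrypted under a media key; destroying the key renders all
# blocks unreadable without touching the NAND itself.
# (Toy XOR stream cipher for illustration only -- real drives use AES.)
def xor(data: bytes, key: bytes) -> bytes:
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

class ToyDrive:
    def __init__(self):
        self.media_key = os.urandom(16)
        self.blocks = {}

    def write(self, lba, data):
        self.blocks[lba] = xor(data, self.media_key)  # stored encrypted

    def read(self, lba):
        return xor(self.blocks[lba], self.media_key)  # decrypted on read

    def sanitize_crypto_erase(self):
        self.media_key = os.urandom(16)  # old key is gone forever

drive = ToyDrive()
drive.write(0, b"secret payroll data")
assert drive.read(0) == b"secret payroll data"

drive.sanitize_crypto_erase()
print(drive.read(0) == b"secret payroll data")  # False: data unrecoverable
```

Because only a key is replaced, crypto-erase completes almost instantly, whereas block erase and overwrite must touch every block.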
The Directives framework defines a mechanism for the exchange of data between a host and an NVMe subsystem. This enables per-I/O command tagging and gives IT administrators the ability to configure reportable attributes and settings.
The first use of Directives is a feature called Streams for optimizing data placement to boost the endurance and performance of NAND SSDs. Traditionally, before new data can be written to the SSD, large blocks of data first must be erased.
The Streams feature enables a host to use a "stream identifier" to indicate the specific logical blocks of storage that belong to a group of associated data. This enables a read or a write to be tagged with related data stored in other locations.
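The benefit of Streams comes from physical co-location: data tagged with the same stream identifier tends to share erase blocks, so it can later be invalidated together instead of forcing the drive to relocate unrelated data. A minimal sketch, with invented class names and illustrative data:

```python
from collections import defaultdict

# Sketch of the Streams idea: the host tags each write with a stream
# identifier, and the SSD co-locates same-stream data in the same
# erase blocks, reducing write amplification when that data is deleted.
class StreamAwareSSD:
    def __init__(self):
        self.erase_blocks = defaultdict(list)  # stream_id -> block contents

    def write(self, stream_id, lba, data):
        # Same-stream writes land together on the media
        self.erase_blocks[stream_id].append((lba, data))

ssd = StreamAwareSSD()
ssd.write(stream_id=1, lba=0, data=b"log entry")       # short-lived data
ssd.write(stream_id=1, lba=1, data=b"log entry 2")
ssd.write(stream_id=2, lba=100, data=b"cold archive")  # long-lived data

# The log stream can now be erased as a unit, leaving the archive alone
print(len(ssd.erase_blocks[1]), len(ssd.erase_blocks[2]))  # 2 1
```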
Virtualization enhancements define how NVMe flash could be used in a shared storage environment where both physical and virtual controllers are present, including primary storage controllers and secondary storage controllers. NVM Express said the goal is to enable development teams to dedicate a specific SSD to a specific virtual machine (VM).
NVMe 1.4 feature enhancements
NVMe 1.4 was introduced in July 2019. This version adds a number of enhancements and new features:
- Rebuild Assist improves data recovery and enhances data migration operations.
- Persistent Event Log maintains a detailed drive history that can be used for debugging and determining the causes of problems.
- NVM Sets and IO Determinism improve performance and quality of service (QoS).
- Asymmetric Namespace Access (ANA) enhances multipathing for high availability.
- Host Memory Buffer (HMB) reduces latency and aids in SSD design.
- Persistent Memory Region (PMR) allows host systems to read and write directly to the dynamic random access memory (DRAM) that SSDs include alongside their core flash, memory that had previously been used primarily for caching.
The new features will require flash drive manufacturers to upgrade their products to incorporate the enhancements. New drivers will also be required for OSes.
NVMe form factors and standards
The need for a storage interface and protocol to better exploit NAND flash's performance potential in enterprise environments was the principal impetus behind the development of the NVMe spec. But reimagining the connection standard opened the doors to several different types of interface implementations that could stay within the bounds of the new spec while offering a variety of implementation options.
In short order, a number of flash form factors conforming to NVMe specifications emerged, including conventional-type add-in cards (AIC) for the PCIe bus, and new form factors for SSDs dubbed M.2 and U.2.
- AIC. The AIC form factor allows manufacturers to create their own cards that slot into the PCIe bus without worrying about storage bay designs or similar limitations. The cards are often designed for special use cases and may include additional processors and other chips to enhance the performance of the solid-state storage.
- M.2. The M.2 form factor was developed to take advantage of NAND flash's compact size and low heat discharge. As such, M.2 devices aren't intended to fit into traditional drive bay compartments, but rather to be deployed in much smaller spaces. Often described as about the size of a stick of gum, M.2 SSDs measure 22mm wide and generally 80mm long, although some products may be longer or shorter.
- U.2. Unlike the M.2 form factor, U.2 SSDs were designed to fit into existing storage bays originally intended for standard SATA or SAS devices. U.2 SSDs look very much like those older media, as they typically use the 2.5-inch or 3.5-inch enclosures that are familiar housings for HDDs. The idea, of course, was to make it as easy as possible to implement NVMe technology with as little reengineering as possible.
Another, less widely deployed NVMe form factor is the enterprise and data center SSD form factor, or EDSFF. It's backed by key storage industry players, like Intel, Dell EMC, Hewlett Packard Enterprise (HPE), Lenovo, Samsung and others. The goal of EDSFF is to bring higher performance and capacities to enterprise-class storage systems. Perhaps the best-known examples of EDSFF flash are Intel's E1.L (long) and E1.S (short) devices, which are provided in what was originally referred to as the "ruler" form factor.
NVMe over Fabrics
NVM Express Inc. published the 1.0 version of the NVMe over Fabrics (NVMe-oF) specification on June 5, 2016. NVMe-oF is designed to extend the high-performance and low-latency benefits of NVMe across network fabrics that connect servers and storage systems, such as Fibre Channel (FC), Ethernet and InfiniBand.
Fabric transports include NVMe-oF using remote direct memory access (RDMA) and NVMe-oF mapped to FC. A technical subgroup of NVM Express Inc. worked on NVMe-oF with RDMA, and the T11 committee of the International Committee for Information Technology Standards (INCITS) is responsible for the development of NVMe over FC (FC-NVMe).
The NVMe-oF specification is largely the same as the NVMe specification. One of the main differences between NVMe-oF and NVMe is the methodology for transmitting and receiving commands and responses. NVMe is designed for local use and maps commands and responses to a computer's shared memory via PCIe. By contrast, NVMe over Fabrics employs a message-based system to communicate between the host computer and target storage device.
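The message-based approach can be sketched as follows. In NVMe-oF terminology, a command and any inline data travel together in a "capsule" across the fabric; the JSON encoding and field values below are purely illustrative stand-ins for the binary wire format.

```python
import json

# Contrast sketch: local NVMe places a command in a shared-memory
# submission queue, while NVMe-oF wraps the command in a self-contained
# message (a "capsule") and sends it across the fabric.
# Field names mirror NVMe read-command fields but values are invented.
def build_capsule(opcode, namespace_id, lba, length):
    command = {"opcode": opcode, "nsid": namespace_id,
               "slba": lba, "nlb": length}
    return json.dumps({"capsule": command}).encode()  # bytes on the wire

def parse_capsule(raw):
    # Target side: unpack the message and recover the command
    return json.loads(raw.decode())["capsule"]

wire_bytes = build_capsule("read", namespace_id=1, lba=0, length=8)
echoed = parse_capsule(wire_bytes)
print(echoed["opcode"], echoed["nsid"])  # read 1
```

Because the capsule is self-contained, it can cross any transport that delivers messages reliably, which is what lets NVMe-oF run over Fibre Channel, RDMA fabrics and, later, TCP.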
The stated design goal for NVMe-oF was to add no more than 10 microseconds of latency for communication between an NVMe host computer and a network-connected NVMe storage device, in comparison to the latency associated with an NVMe storage device using a local computer's PCIe bus.