Storage networks started becoming popular in the late 1990s and early 2000s with the widespread adoption of Fibre Channel technology. For those who didn't want the expense of installing dedicated Fibre Channel hardware, the iSCSI protocol provided a credible Ethernet-based alternative a few years later. Both transports rely on the use of SCSI as the storage protocol for communicating between source (initiator) and storage (target). As the storage industry moves to adopt flash as the go-to persistent medium, we're starting to see SCSI performance issues.
This has led to the development of NVMe, or nonvolatile memory express, a new protocol that aims to surpass SCSI and resolve performance problems. Let's take a look at NVMe and how it differs from other protocols. We'll also explore how NVMe over Fabrics is changing the storage networking landscape.
How we got here
Storage networking technology is based on the evolution of storage hardware and the need for consolidated and centralized storage. We can trace Fibre Channel's origins back to ESCON on the mainframe, a fiber-based connection protocol. SCSI, on the other hand, comes from the physical connection of hard drives within servers.
SCSI was originally a parallel communication protocol -- anyone familiar with installing disks into servers will remember ribbon cables. It transitioned to a serial interface with the development of SAS. The PC counterpart was parallel ATA, which developed into SATA, with the Advanced Host Controller Interface (AHCI) as the host-side interface to SATA devices. You find both SAS and SATA on current hard drives and solid-state disks.
Fibre Channel or Ethernet provides the physical connectivity between servers and storage, with SCSI still acting as the high-level storage communication protocol. However, the industry developed SCSI to work with HDDs, where response times are orders of magnitude slower than system memory and processors. As a result, although we may think SSDs are fast, we see serious performance issues with internal ones. Most SATA drives are still based on the SATA 3.0 specification with an interface limit of 6 Gbps and 600 MBps throughput. SAS drives have started to move to SAS 3.0, which offers 12 Gbps throughput, but many still use 6 Gbps connectivity.
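The gap between a 6 Gbps line rate and 600 MBps of throughput comes from link encoding. As a rough sketch, assuming the 8b/10b encoding these interfaces use (10 bits on the wire for every 8 bits of data), the arithmetic looks like this. The function name is purely illustrative.

```python
def payload_mbps(line_rate_gbps: float, data_bits: int = 8, coded_bits: int = 10) -> float:
    """Usable throughput in megabytes per second for an encoded serial link."""
    # Only data_bits out of every coded_bits transmitted carry payload.
    usable_bits_per_second = line_rate_gbps * 1e9 * data_bits / coded_bits
    # 8 bits per byte, 1e6 bytes per megabyte.
    return usable_bits_per_second / 8 / 1e6


for name, rate in [("SATA 3.0", 6), ("SAS 3.0", 12)]:
    print(f"{name}: {rate} Gbps line rate -> {payload_mbps(rate):.0f} MBps")
```

This ignores protocol overhead above the link layer, so real-world figures will be somewhat lower.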
The issue for both SAS and SATA, however, is the ability to handle concurrent I/O to a single device. Look at the geometry of a hard drive, and it's easy to see that handling multiple concurrent I/O requests ranges from hard to impossible. With some serendipity, the read/write heads may align for multiple requests. And you can use some buffering, but it's not a scalable option. Neither SAS nor SATA was designed to handle multiple I/O queues. AHCI has a single queue with a depth of only 32 commands. SCSI is better and offers a single queue with 128 to 256 commands, depending on the implementation.
Single queues negatively affect latency. As queue size increases, new requests see a greater latency because they must wait behind other requests to be completed. This was less of an issue with hard drives, but a single queue is a big bottleneck with solid-state media where there are no moving parts and individual I/O latency is low.
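A toy model makes the queuing effect concrete. The sketch below assumes requests are served strictly in arrival order, that each takes a fixed service time, and that a device can work on a fixed number of requests at once. The 100-microsecond service time and the eight parallel units are made-up illustrative values, not measurements of any product.

```python
import math


def completion_time_us(position: int, service_time_us: int, parallel_units: int = 1) -> int:
    """Microseconds until the request at 1-based `position` in the backlog completes.

    Requests are served in arrival order, `parallel_units` at a time,
    and each one takes `service_time_us` microseconds.
    """
    rounds = math.ceil(position / parallel_units)
    return rounds * service_time_us


SERVICE_TIME_US = 100  # assumed flash service time, for illustration only

for depth in (1, 32, 256):
    serial = completion_time_us(depth, SERVICE_TIME_US)
    parallel = completion_time_us(depth, SERVICE_TIME_US, parallel_units=8)
    print(f"request #{depth}: {serial} us served one at a time, {parallel} us served eight at a time")
```

In this model, the last request in a full 32-command queue waits 32 times the service time when requests are handled one at a time. That is why a single serialized queue hurts most on media that could otherwise service many requests concurrently.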
The industry's answer to the interface problem is NVMe as a replacement for SCSI, both at the device and network levels. Nonvolatile memory express uses the PCIe bus, rather than a dedicated storage bus, to provide greater bandwidth and lower latency connectivity to internally connected disk devices. A PCIe 3.0 four-lane device, for example, has around 4 GBps of bandwidth per device.
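As a rough check on the PCIe figure, the sketch below assumes the PCIe 3.0 signaling rate of 8 gigatransfers per second per lane and its 128b/130b encoding. The function name is illustrative.

```python
def pcie_gbytes_per_second(gt_per_s: float, lanes: int,
                           data_bits: int = 128, coded_bits: int = 130) -> float:
    """Approximate one-direction PCIe bandwidth in gigabytes per second."""
    # Each transfer moves one bit per lane; data_bits of every coded_bits are payload.
    usable_gbits = gt_per_s * data_bits / coded_bits * lanes
    return usable_gbits / 8


print(f"PCIe 3.0 x4: about {pcie_gbytes_per_second(8, 4):.2f} GBps")
```

The result is a little under 4 gigabytes per second in each direction, several times what a 6 Gbps SATA link can carry, before any protocol overhead is taken into account.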
The biggest change in NVMe has been the optimization of the storage protocol. The amount of internal locking needed to serialize I/O access has been reduced, while the efficiency of interrupt handling has increased. In addition, NVMe supports up to 65,535 queues, each with a queue depth of 65,535 entries. So rather than having a single queue, NVMe provides for massive parallelism for the I/O to a connected device. In modern IT environments where so much work is done in parallel -- just think of the number of cores in modern processors -- we can see the benefits of having storage devices capable of processing multiple I/O queues and how this will improve the external I/O throughput.
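To illustrate the multi-queue idea, here is a toy model in which each CPU core gets its own pair of submission and completion queues, so cores never contend for a shared queue. Giving each core its own pair is an assumption made for the sake of the example, and `QueuePair` and `ToyController` are hypothetical names, not part of any real driver or API.

```python
from collections import deque
from dataclasses import dataclass, field

MAX_QUEUES = 65_535        # limits as quoted in the text above
MAX_QUEUE_DEPTH = 65_535


@dataclass
class QueuePair:
    """One submission queue and its matching completion queue."""
    depth: int
    submission: deque = field(default_factory=deque)
    completion: deque = field(default_factory=deque)

    def submit(self, command: str) -> bool:
        """Queue a command; return False if this queue is already full."""
        if len(self.submission) >= self.depth:
            return False
        self.submission.append(command)
        return True


class ToyController:
    """Hypothetical device model with one queue pair per CPU core."""

    def __init__(self, cores: int, depth: int = 1024):
        if not 1 <= cores <= MAX_QUEUES:
            raise ValueError("queue count out of range")
        if not 1 <= depth <= MAX_QUEUE_DEPTH:
            raise ValueError("queue depth out of range")
        self.pairs = [QueuePair(depth) for _ in range(cores)]

    def submit(self, core: int, command: str) -> bool:
        # Each core touches only its own queue, so no cross-core lock is modeled.
        return self.pairs[core].submit(command)

    def process(self) -> int:
        """Move every queued command to its completion queue; return the count."""
        done = 0
        for pair in self.pairs:
            while pair.submission:
                pair.completion.append(pair.submission.popleft())
                done += 1
        return done


controller = ToyController(cores=4, depth=2)
controller.submit(0, "read block 10")
controller.submit(1, "write block 99")
print(controller.process(), "commands completed")
```

A real controller services its queues concurrently in hardware; the loop in `process` only stands in for that behavior.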
A higher-level protocol
NVMe is a higher-level protocol, like SCSI, that manages storage requests across either a physical internal bus, such as PCIe, or externally with Fibre Channel or converged Ethernet.
The NVM Express working group, a consortium of about 90 companies, developed the NVMe specification in 2012. Samsung was first to market the next year with an NVMe drive. The working group released version 1.3 of the NVMe specification in July, adding features for security, resource sharing and SSD endurance management.
NVMe over Fabrics
If NVMe is a replacement for the storage protocol in device connectivity, it isn't difficult to see that NVMe could also replace SCSI in the iSCSI and Fibre Channel protocols. That's exactly what's happening with the development of fabrics-based NVMe standards, work on which started in 2014 and which were published last year.
There are two types of transport in development: NVMe over Fabrics using remote direct memory access (RDMA) and NVMe over Fabrics using Fibre Channel (FC-NVMe).
RDMA enables the transfer of data to and from the application memory of two computers without involving the processor, providing low latency and fast data transfer. RDMA implementations include InfiniBand, iWARP and RDMA over Converged Ethernet, or RoCE (pronounced "rocky"). Vendors, such as Mellanox, offer adapter cards capable of speeds as much as 100 Gbps for both InfiniBand and Ethernet, including NVMe over Fabrics offload.
NVMe allows massively parallel access to flash devices, opening up the possibility of fully exploiting the performance of SSDs and, in the future, 3D XPoint. It's a game-changer in terms of performance.
NVMe over Fibre Channel uses current Fibre Channel technology that can be upgraded to support SCSI and NVMe storage transports. This means customers could potentially use the technology they have in place by simply upgrading switches with the appropriate firmware. At the host level, host bus adapters (HBAs) must support NVMe -- typically 16 Gbps or 32 Gbps -- and, obviously, storage devices have to be NVMe over Fabrics capable, too.
As NVMe takes hold in the data center, the most obvious option is to use NVMe devices in servers. Vendors are already starting to bring NVMe-capable servers to market, with physical connector and BIOS support.
Most modern operating systems already support NVMe, as do hypervisor platforms such as VMware vSphere. VMware's vSAN platform has been supporting NVMe devices for more than 18 months.
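On a Linux host, one quick way to see whether the operating system has detected any NVMe controllers is to look in sysfs. The sketch below assumes the kernel exposes controllers under `/sys/class/nvme`, each with `model` and `transport` attribute files. Treat those paths as assumptions to verify on your own system; the function name is illustrative.

```python
from __future__ import annotations

from pathlib import Path


def list_nvme_controllers(sysfs_root: str = "/sys/class/nvme") -> list[dict]:
    """Return one dict per NVMe controller found under `sysfs_root`."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return []  # no NVMe support loaded, or not a Linux system
    controllers = []
    for entry in sorted(root.iterdir()):
        info = {"name": entry.name}
        for attr in ("model", "transport"):
            attr_file = entry / attr
            # A missing attribute is reported as None rather than raising.
            info[attr] = attr_file.read_text().strip() if attr_file.is_file() else None
        controllers.append(info)
    return controllers


for controller in list_nvme_controllers():
    print(controller)
```

On a system with no NVMe devices the loop simply prints nothing.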
Another option is to support NVMe as the back-end storage connectivity in a storage appliance. Storage vendors have already made the transition to SAS as a back-end interface, replacing Fibre Channel Arbitrated Loop and parallel SCSI over time.
NVMe will supplant SAS as the main internal protocol for storage arrays. In correctly architected products, this change will result in significant performance improvements as the benefits of flash are unlocked.
Implementing NVMe would deliver fast, low-latency connectivity for flash devices with significant improvement in array performance subject to efficient storage operating system code. To date, we've seen Hewlett Packard Enterprise announce NVMe support for 3PAR, NetApp introduce NVMe as a read cache in FlashCache and Pure Storage provide it for its FlashArray//X platform.
Pure's FlashArray//X with NVMe claims to provide half the latency of the previous generation with twice the write bandwidth. However, these specifications don't include host-based NVMe over Fabrics support, so there's still a potential performance increase to come.
Full adoption of NVMe technology means implementing an entire SAN using NVMe, exactly what NVMe over Fabrics will offer. Prospective customers have the two implementation options described above, with a potential transition to FC-NVMe available for data centers that have the right infrastructure in place. Earlier this year, Cisco announced support for FC-NVMe with its high-end MDS 9710 Fibre Channel director. Brocade already supports NVMe in its Gen6 32 Gbps switches, including the recently announced G610.
Fibre Channel manufacturers claim customers moving to NVMe can avoid having to rip and replace if they have suitable Fibre Channel equipment. That's true for data centers already supporting 32 Gbps connectivity; however, most servers probably aren't using 32 Gbps HBA cards.
When NVMe-capable storage arrays come along, it's possible customers won't have to upgrade to NVMe all at once because SCSI and NVMe can coexist on the same infrastructure. Any hardware savings will depend on the environment. Data centers and IT departments familiar with Fibre Channel from a management and operational perspective may find the transition easier than moving to converged Ethernet, which never really took off because of the expense of replacing hardware.
NVMe over Fabrics can coexist within Fibre Channel Gen 6 technology and onwards. This allows a transition to storage arrays supporting NVMe over Fabrics and an easier transition than the rip-and-replace approach required for a move to NVMe over Ethernet.
The alternative to Fibre Channel is to use NVMe over RDMA and implement a new storage network for slightly higher performance at the expense of scalability. Several vendors offer products using this approach. Startup E8 Storage has developed an NVMe-based storage array that uses 100 Gigabit Ethernet (GbE) converged switches and RDMA network interface cards to implement a high-performance SAN. The company claims as much as 10 million read and 2 million write IOPS with 100 μs (microseconds) read and 40 μs write latency.
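Vendor figures like these also show why deep, parallel queues matter. Little's law says the average number of requests in flight equals throughput multiplied by latency. The sketch below applies it to the claimed numbers, taking them at face value and assuming the peak IOPS and the quoted latency apply at the same time, which the claim does not confirm.

```python
def outstanding_requests(iops: int, latency_us: int) -> float:
    """Average I/Os in flight, by Little's law (throughput x latency)."""
    return iops * latency_us / 1_000_000


reads_in_flight = outstanding_requests(10_000_000, 100)
writes_in_flight = outstanding_requests(2_000_000, 40)
print(f"reads in flight:  {reads_in_flight:.0f}")
print(f"writes in flight: {writes_in_flight:.0f}")
```

Under those assumptions, sustaining the read figure needs roughly 1,000 requests in flight at once, far more than a single 32-command AHCI queue could hold.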
Excelero, another startup, has developed what it calls NVMesh, a software-based product that uses a mesh of NVMe-enabled servers to create a distributed compute and storage fabric to implement a range of systems, such as a hyper-converged compute environment. The company has partnered with Micron to produce a platform called SolidScale based on Micron 3.2 TB SSDs and Mellanox Ethernet RoCE switches.
At Pure Storage's Accelerate conference in June, the company announced support for NVMe over Fabrics with Cisco as part of its FlashStack reference architecture. This will include FlashArray//X, Cisco MDS 9700 directors and Cisco UCS, or Unified Computing System, C-Series servers with 32 Gbps HBAs. Pure also announced the ability to support additional shelves on a single controller using NVMe over Fabrics for back-end shelf connectivity.
Yet another startup, Apeiron Data Systems, is developing an NVMe array architecture based on 40 GbE and an externalized hyper-converged design that enables storage and compute to scale independently.
We see NVMe supplanting SCSI and SAS as the default connectivity for SSD devices. High-end deployments will make use of NVMe over Fabrics, where the expense can be justified for the benefit of application performance.
It will be interesting to see NVMe used in existing array platforms, which retain features such as snapshots, replication, compression and deduplication, alongside the adoption of new platform architectures, like those from Excelero and Apeiron, that won't be as feature rich. In the past, this lack of features has kept NVMe-based products from gaining traction. However, over time, surely NVMe will supplant the legacy architectures that survived the move to all-flash but don't allow solid-state to reach its full potential.