NVMe and NVMe over fabrics promise to bring dramatic changes to data center storage infrastructures by enabling increased application density and the use of entirely new applications like machine learning and high-velocity analytics. Upgrading to storage based on the NVMe protocol requires careful consideration and planning so an organization can maximize its capabilities and not waste money.
When flash first came to market, manufacturers designed SSDs to look and act like SAS and SATA drives. This meant performance compromises were made to ensure compatibility and easy adoption. As the flash market evolved, a few vendors introduced PCIe-based flash with proprietary drivers that worked around the SCSI protocol. The problem was that these offerings were incompatible with other types of SSDs, heading the data center toward a massively fragmented storage environment.
Enter the standards-based NVMe protocol, designed specifically for memory-based storage. NVMe is a replacement for traditional storage protocols like SAS and SATA, and the most apparent difference is how an NVMe-based drive connects to the server or storage system. Instead of connecting to a SCSI adapter, NVMe connects directly to the PCIe bus, providing storage with direct access to the CPU. NVMe also allows far more parallelism: It supports roughly 64,000 queues -- up from just one in SCSI -- and as many as 64,000 commands in each of those queues -- up from a mere 32 commands for SCSI's single queue.
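To put those queue figures in perspective, the short Python sketch below multiplies them out. It uses the rounded numbers quoted above rather than exact specification limits, and the constant and function names are illustrative only.

```python
# Rounded figures as quoted in the article, not exact spec limits.
NVME_QUEUES = 64_000
NVME_COMMANDS_PER_QUEUE = 64_000
LEGACY_QUEUES = 1
LEGACY_COMMANDS_PER_QUEUE = 32


def max_outstanding(queues: int, commands_per_queue: int) -> int:
    """Upper bound on the number of commands that can be queued at once."""
    return queues * commands_per_queue


nvme_total = max_outstanding(NVME_QUEUES, NVME_COMMANDS_PER_QUEUE)
legacy_total = max_outstanding(LEGACY_QUEUES, LEGACY_COMMANDS_PER_QUEUE)

print(f"NVMe:   {nvme_total:,} outstanding commands")
print(f"Legacy: {legacy_total:,} outstanding commands")
print(f"Ratio:  {nvme_total // legacy_total:,}x")
```

These are theoretical ceilings. Real drives and drivers typically expose far fewer queues, often about one per CPU core, so the practical gain comes from spreading I/O across cores rather than from filling every queue.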
What is NVMe-oF?
The NVMe protocol cures many of the performance problems associated with flash I/O bottlenecks, but sharing an NVMe-based system over Fibre Channel (FC) or IP means reintroducing SCSI into the I/O path. NVMe-oF solves this problem by enabling a storage network to carry NVMe commands natively. In theory, a storage environment that is end-to-end NVMe -- meaning there are NVMe network adapters in the application server and both NVMe network adapters and NVMe drives in the storage system -- should deliver the same performance and low latency as direct-attached NVMe drives. As NVMe-oF becomes ubiquitous, DAS configurations will become almost unnecessary.
The good news is that much of the work on NVMe-oF is complete. Most network switches shipped in the last two years should be firmware upgradable to NVMe-oF. In addition, both FC and IP are supported. Currently, FC has the advantage in that it supports both NVMe and traditional FC-based SCSI through the same network switch.
How is NVMe delivered today?
The most common form of NVMe is NVMe drives installed in the server. Most midrange to high-end servers -- even most high-end laptops -- have NVMe drives built in or available for expansion. NVMe all-flash storage systems are also available.
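For readers who want to check whether a Linux server already has NVMe drives, here is a minimal sketch that reads the controller entries the kernel NVMe driver publishes under `/sys/class/nvme`. It assumes a reasonably recent Linux kernel where each controller directory carries `model` and `transport` attributes. The function name is hypothetical, and missing attributes are reported as "unknown" rather than guessed.

```python
from pathlib import Path


def list_nvme_controllers(sysfs_root: str = "/sys/class/nvme") -> list[dict[str, str]]:
    """Return one dict per NVMe controller directory found under sysfs_root."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return []  # not Linux, or the NVMe driver is not loaded

    controllers = []
    for ctrl in sorted(root.iterdir()):
        info = {"name": ctrl.name}
        # Assumed attribute files; unreadable or absent ones become "unknown".
        for attr in ("model", "transport"):
            try:
                info[attr] = (ctrl / attr).read_text().strip()
            except OSError:
                info[attr] = "unknown"
        controllers.append(info)
    return controllers


if __name__ == "__main__":
    for c in list_nvme_controllers():
        print(f"{c['name']}: {c['model']} (transport: {c['transport']})")
```

On a system with local drives, the transport would normally read as PCIe. A fabric-attached controller would be expected to report its fabric transport instead.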
The majority of all-flash storage systems today support only traditional FC or IP connections to the storage network, not NVMe-oF. They still deliver a significant performance increase, but not as much as an end-to-end offering would. Most systems are not advertised as being upgradable to NVMe, so once customers make their selections, they should understand they will always be in this mixed configuration.
There are a few end-to-end NVMe protocol offerings on the market. Most of these use an IP-based NVMe-oF configuration and have a relatively limited number of options in terms of support. A few end-to-end NVMe-oF products are more open but require FC, an extra investment if the organization doesn't have it.
When should you move to NVMe?
Justifying an NVMe investment based on performance requirements generally requires an organization to generate 300,000 to 700,000 IOPS on a relatively consistent basis. For traditional workloads, like virtual environments and bare-metal database applications, an organization would need to create denser application infrastructures. For virtual machines, this likely means doubling, if not tripling, the number of VMs per physical server. Most servers have plenty of CPU power, but are either storage or RAM constrained. NVMe not only helps with storage I/O, but helps to ease RAM constraints because it provides near-RAM performance on virtual memory swap pools.
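A rough way to test an environment against that 300,000 to 700,000 IOPS range is to multiply out server count, VM density and per-VM I/O. The sketch below does that. The per-VM IOPS figure and server counts in the example are hypothetical placeholders, not measurements, and should be replaced with sustained numbers from your own monitoring.

```python
NVME_IOPS_LOW = 300_000   # lower end of the range cited in the article
NVME_IOPS_HIGH = 700_000  # upper end of the range cited in the article


def aggregate_iops(servers: int, vms_per_server: int, iops_per_vm: float) -> float:
    """Estimate sustained IOPS hitting shared storage across all servers."""
    return servers * vms_per_server * iops_per_vm


def nvme_case(sustained_iops: float) -> str:
    """Place an IOPS estimate relative to the article's justification range."""
    if sustained_iops < NVME_IOPS_LOW:
        return "below range"
    if sustained_iops <= NVME_IOPS_HIGH:
        return "within range"
    return "above range"


# Hypothetical environment: 40 servers, 200 sustained IOPS per VM.
for density in (25, 50, 75):  # current, doubled and tripled VMs per server
    total = aggregate_iops(servers=40, vms_per_server=density, iops_per_vm=200)
    print(f"{density} VMs/server -> {total:,.0f} IOPS ({nvme_case(total)})")
```

In this made-up example, the environment only enters the range once VM density doubles or triples, which mirrors the point about needing denser application infrastructures.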
A hyper-converged infrastructure should benefit even further from using NVMe storage in cluster nodes, especially if the HCI storage software has the ability to position each VM's primary data store in the same server as the VM. Although write I/O likely still replicates across the network for data protection, the VM gains full NVMe performance for read I/O.
Most organizations currently don't need end-to-end NVMe. Some exceptions are AI, machine learning, deep learning and high-velocity analytics workloads, which often involve scanning millions, if not billions, of small files, so the speed and latency of the storage device directly affect how quickly results arrive. Storage performance matters even more as these workloads scale and go into production.
Part of the decision on when an organization moves to the technology will be made for it by vendors. As the price delta continues to narrow between complete NVMe-based arrays and SAS-based arrays, the natural inclination will be to purchase an NVMe system even if the performance is not yet needed.
Addressing the NVMe performance bottleneck issue
By today's standards, NVMe itself is not a performance bottleneck. It does, however, expose bottlenecks elsewhere because the media is so low in latency. In particular, it exposes components -- above all, the storage software -- that used to hide behind media latency. Storage vendors have taken three approaches to try to improve storage software performance:
- Keep software basically the same, but combine it with more powerful processors. The problem is that the standard Intel processors driving most of these software offerings have improved performance by increasing the number of cores, not the performance of each core. Unless the software efficiently multithreads across these processors, including more cores has a diminishing return as the core count increases.
- Turn software into hardware by using field-programmable gate arrays (FPGAs) or even custom silicon. Turning software into hardware enables storage services to run on dedicated hardware and processing. The FPGA or silicon approach adds cost versus using off-the-shelf Intel CPUs. It also makes software upgrades more difficult, and an organization will need to periodically reprogram the FPGAs in the storage system.
- Rewrite software from the ground up to take full advantage of various changes in hardware. Rewriting starts with creating truly parallel threads that can stripe across cores instead of being dedicated to one core. A rewrite should go further by also rewriting the algorithms for data protection, data placement, snapshots and volume management. While rewriting software is probably the best long-term solution, it is also the most difficult and time-consuming approach.
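The diminishing return from adding cores, mentioned in the first approach, can be illustrated with Amdahl's law, a standard model that the article does not name. In this sketch, the 90% parallel fraction is an assumed value chosen for illustration, not a measurement of any storage product.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup predicted by Amdahl's law for a given parallel fraction."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Assumed: 90% of the storage software's work can run in parallel.
for cores in (2, 4, 8, 16, 32, 64):
    print(f"{cores:>2} cores: {amdahl_speedup(0.90, cores):5.2f}x")
```

Under that assumption, speedup can never exceed 10x no matter how many cores are added, which is why the third approach focuses on reducing the serial portion of the software rather than on buying more cores.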
Does NVMe cost more?
NVMe drives are quickly reaching price parity with SAS flash drives. But getting the full performance benefit from these drives requires the ecosystem surrounding NVMe to be more powerful and, therefore, more expensive than the ecosystem surrounding a SAS-based all-flash array. Typically, an NVMe-based storage system will have much more powerful processors and more sophisticated motherboards with additional PCIe lanes and RAM. All of these components can add significant cost to an NVMe investment. If an organization decides to use an end-to-end NVMe protocol architecture, new network adapters may also be needed.
Market transition to NVMe
Within the next three years, most systems on the market will be based on the NVMe protocol, and the cost differential between a midrange SAS all-flash array and an NVMe array will be negligible. The transition will start by connecting an NVMe array to the existing legacy network. Networks tend to refresh at a slower pace than storage, so connecting NVMe arrays into a SCSI-based network fits that slower networking transition. That network will gradually change to the NVMe protocol over the next five to six years, by the time of the second storage refresh, which means the array at that point will be NVMe internally and externally.
Two types of organizations need to transition to NVMe sooner: those with a legitimate need for the performance boost and those that need to refresh storage within the next two years. If full NVMe performance is required right now, consider an end-to-end offering despite the protocol's newness. Here, speeding the time to results is worth any potential risk in turning to startup storage vendors (see "Understanding NVMe market vendor participation") and new networking infrastructure.
Understanding NVMe market vendor participation
Vendors participate in the NVMe protocol market at several levels based on their capabilities.
The first level is suppliers -- including Intel, Kingston Technology, Micron Technology, Samsung, Seagate Technology, Toshiba and Western Digital Corp. -- who deliver the NVMe drive or media itself. These vendors differentiate themselves by performance, consistency of performance under load, programmability for specific workloads and density (number of terabytes per drive).
Most enterprises won't deal directly with drive-level decisions, as the next class of vendors, NVMe array providers, will make those choices for them. NVMe array providers fall into one of the following two categories:
- Existing storage array vendors. These include DataDirect Networks (Tintri), Dell EMC, Hewlett Packard Enterprise, NetApp, Pure Storage, Western Digital (Tegile) and many others that actively participated in the all-flash array market when it was built solely on SAS-based flash. Currently, most of these vendors have released an NVMe version of their products, and, in most cases, the unit is a refresh of their current offering but with PCIe-connected NVMe drives. A few have changed their storage software to take advantage of the enhanced command count and queue depth of NVMe. Others, like Kaminario, use NVMe-oF to network their back-end storage nodes with front-end storage controllers.
- Startups. These include vendors like Apeiron Data Systems, E8 Storage, Excelero, Pavilion Data Systems and Vexata that deliver end-to-end NVMe storage. Most of these offerings are IP-based NVMe-oF, but a few support NVMe-oF on Fibre Channel.
For organizations that have a SAS flash array and need to upgrade, serious consideration should be given to vendors that can deliver an NVMe array with standard connectivity to the network. These are available from established vendors and require no changes to the network.
NVMe and NVMe-oF represent an important step forward for memory-based storage, and the new protocol is being rapidly adopted by the industry. Many organizations are not pushing their SAS all-flash arrays to their limits, so time is on the side of IT professionals. Unless you have workloads that can fully exploit NVMe architecture, you should take the time required to understand the protocol and create a roadmap for your organization.