
NVMe at Scale: A Radical New Approach to Improve Performance and Utilization

Here is your introduction on how apps can transform by making the right infrastructure choices that enable NVMe to scale, with minimum limitations, HPC and next-gen HCI use cases and more.

00:06 Nishant Lodha: Hello, and welcome to virtual Flash Memory Summit 2020. My name is Nishant Lodha and with me is Shahar Noy. Here and today we are going to talk to you guys about an interesting new concept about a relatively older topic, which is doing NVMe at scale, but using a very radically different approach that looks at how you can improve utilization and efficiency when deploying NVMe at scale.

Before we jump into the topic itself, a quick introduction. Like I said, my name is Nishant Lodha. I am a technologist for Marvell, focusing on connectivity solutions. Shahar is the handsome variant of me. He's the technologist in the storage solution, so think of us as two ends of the wire. While I look at connectivity and connectivity to storage, Shahar focuses his time on making efficient, reliable, connected storage. Together, both of us, we call ourselves visionaries, but also enablers. So, we not only peek into the future, we also work on products; bring products to market; work with customers, partners, the community to make our customers successful. With that, let's talk about the agenda today.

01:29 NL: What we want to start off with is let's talk about how NVMe is deployed today. How is NVMe deployed at scale today, and can it be -- using existing paradigms -- be deployed without complexity, with efficiency? I want to then start off by giving some color about a new paradigm of technologies, problem statements, deployment scenarios, which are present in data centers today or are coming shortly into them. Both cloud, as well as enterprise data centers, and with these new paradigms, are coming new challenges. We talk about those paradigms, those challenges.

And finally, the reason me and Shahar wanted to talk to all of you guys about is we are introducing a radical new approach, a completely different approach that leverages our past experiences with NVMe at scale, but gives it an entire new perspective on how you view things, how applications and customers consume storage, connect to storage and do that efficiently at scale and with ease.

02:42 NL: Shahar will come back then and give you more color on use cases and his use cases you will see things that are relevant today, for example, things around software-defined storage and machine learning, deep learning. He will also talk about some interesting upcoming use cases where this new and radical approach can significantly bring efficiencies and performance and scale and things like that. And, finally, Shahar will talk to you about what's the view from the summit.

Once we have done all of this, we've achieved all of this, and when we go back, sit back and look at all the work that we have done, how does the world look like? So, hang in there while we take you through that story. When you think of NVMe or non-volatile memory over PCI Express, what is the first thing that comes to your mind? You think of it as performance, as the next generation of storage, it's all about performance.

03:40 NL: Similarly, when you think of NVMe over Fabrics, you think of scale, you think of scalability, you think about reaching and accessing NVMe irrespective of where it resides -- far away, remote in some external storage and some other consumable object or things like that. But increasingly customers are demanding and asking that, "Is my scale that I'm getting with NVMe over Fabrics efficient?" They're asking the question whether this efficiency that people claim today is real. And they are asking for efficiencies, not just in terms of performance; they're asking in terms of capacity utilization, they're asking an overall cost of ownership, the cost of individual I/O and all of that. And it's important to note that irrespective of where the applications reside, whether it's sitting on-prem, inside the cloud or even someday inside a connected car, they require efficient access to NVMe and efficiency at scale.

04:44 NL: So, with that, let's peer into some market data here on the right-hand side. In general, I think it's common knowledge now that NVMe price points have declined significantly, and by many estimates they are already close to what SAS- and SATA-based SSDs used to be. And it is not just the price declines that have brought about accelerated adoption of NVMe; it has been the applications, the demand for performance, the demand for response times from applications and cloud and web-scale applications that have driven NVMe to new heights.

Let's look at the chart here on the right-hand side, and it plots two lines. One maps the attach rate of NVMe -- or, as they are called here, PCIe SSDs -- on servers, or more specifically inside servers. The other shows how that attach rate looks for storage; think of that as external dedicated storage boxes, appliances, storage arrays, all-flash arrays, whatever you call them.

05:55 NL: A couple of key takeaways from the chart here. First thing is that you definitely see much larger, much more significant percentage-wise adoption of NVMe PCIe SSDs inside servers. But having said that, there is projected significant growth, which is supposed to happen over the next several years, even in terms of external storage consuming NVMe. And both me and Shahar will give a little bit more color that there is a world between the server and the storage, where PCIe, NVMe SSDs are not just captive to servers, neither are they sitting far away in external storage, but a new concept we'll introduce to you shortly in that.

06:48 NL: But the key takeaway is that whether it is a PCIe SSD sitting inside the server, hanging off the server in some way or connected to a server via external storage, customers are demanding that vendors and developers and engineers and technology leaders provide them with solutions that are efficient and efficient at scale, and that has been a significant market trend to monitor. And all of that has primarily been relying on NVMe over Fabrics. And NVMe over Fabrics, as a lot of you guys know, provides a highly efficient as well as scalable method to go and get access to NVMe irrespective of where it resides. My conversation here is not just about NVMe over Fabrics, but about a trend within NVMe over Fabrics, a trend in how storage is built and consumed in this new paradigm that the next generation of applications will meet.

07:54 NL: Let's talk about the fabrics themselves, right? Efficiently scaling out NVMe all across your data center requires, in essence, two big things. You definitely need a highly performance-optimized network, a network that understands, that is aware that there is NVMe traversing through it. We want a network that is storage-aware, that's not just oblivious to what goes inside a frame or packet that traverses through that network. We also require an end-to-end NVMe stack and . . . Hold your thought on that. A little bit more on that later.

And lot of people will talk to you about end-to-end NVMe stack, but if you look a little bit deeper into today's architectures, you will see that end-to-end NVMe stack is not really end to end. There could be multiple, different protocols taking NVMe and translating NVMe as they traverse from server to storage as articulated on this right-hand side chart. There are a couple of different fabrics protocols that everybody talks about. There is definitely RoCE v2 and TCP, and then there is for the enterprise business-critical applications, there is Fibre Channel.

09:08 NL: Today, most of our conversations, we want to focus on the Ethernet-based transports, both RDMA, RoCE as well as TCP. So, with that, if you look at efficient NVMe over Fabrics connectivity, what are the few things, what are the essence, what are the key ingredients that you need to build a successful deployment of NVMe over Fabrics?

Going from left to right here, it all starts with server connectivity. You need an end-to-end NVMe protocol that starts at the host with complete native support, all the way from the application through the virtual machine, through the stack, through the RDMA- or TCP-enabled NIC, and you need efficient management and deployment of these protocols from top to bottom. Because, remember, a technology can be very powerful; but if it is too hard to deploy, too difficult to manage, it cannot be deployed in data centers all around. It then gets limited to specialized applications . . . Acting as not just a glue that connects servers and storage, but also an infrastructure that is storage-aware, that understands NVMe, that understands NVMe over Fabrics, and provides a low-latency, boundless infrastructure so that NVMe can scale without limitations.

10:28 NL: On the storage side of things, Shahar will come back here and talk about it in more detail. But if you look at it from the storage perspective, the same protocols that the server is using to communicate with NVMe need to be end to end. Which means we expect absolutely no translations all the way until a packet or frame reaches the storage media. And not only that, we require seamless end-to-end connectivity. There is this need for choices; there is no one single NVMe over Fabrics protocol, RDMA or non-RDMA, that will serve every different use case, and it is imperative to understand that any storage, networking or server connectivity solution that you need must provide choices.

11:14 NL: And all in all, if you look at . . . If you want to make sure that NVMe at scale is successful, is deployable, is manageable, and can deliver the scale and efficiencies that customers in cloud or data centers require, it is absolutely essential that you start thinking about NVMe at scale differently. Look at it with the technologies that Shahar and I are going to talk about with you guys here, and there is some amount of disruption required in order to do NVMe over Fabrics right. It is incremental, but it requires a different way of thinking. So, there are two big fabrics that I said are important to customers. If you're looking to deploy with the simplest options and deploy over existing Ethernet infrastructures, NVMe over TCP is the absolute right choice. But having said that, it is very important to understand that whatever solutions you choose for NVMe over TCP, you need them accelerated, because without acceleration, the success of your solution with NVMe over TCP might be limited.

12:21 NL: So, look at solutions that can do offloads and bring value by accelerating that fabric, so that you can meet the demands of NVMe. Similar is the thing with NVMe over RDMA. Excellent performance, low latency, but you require the skill set or a controlled environment to make sure that deployment can be successful. So, again, but you need understand . . . It is important to understand that customers will need choices, customers will need acceleration, whether it's offloading NVMe over TCP or providing a simple, easy-to-use end-to-end RDMA stack.

13:04 NL: OK, so what is this approach that I have been talking to you guys about? So, if you look at it, there are a couple of building blocks of this approach. If you look at server connectivity itself, it starts with choices, which means any networking solution or NICs that you would look at would need to support RDMA or different types of RDMA, be it RoCE or iWARP, and then finally NVMe over TCP. But it is important to note that if you want to be successful, you need choices and you need acceleration. And NVMe over TCP can solve a lot of problems of simplicity, deployment and management, but it requires you . . . If you want to be successful with that, it requires offload and acceleration. On the networking side of things, there are technologies . . . Again, Shahar will talk about that, things like SAFE, which is the Storage Aware Flow Engine, which allow networking to be aware of NVMe and configurable to the demands of the customers.

14:01 NL: And, finally, on the storage side, there is this innovative new concept of an Ethernet SSD, which is, you take NVMe and give it an IP address, put it on the Ethernet LAN so that it is directly accessible and addressable by applications. And that allows an unrestricted pipe from the application to the actual media, which provides you efficiency, performance, scale. But the solutions don't end with a simple concept like Ethernet SSDs. There are concepts, things like the Marvell Ethernet bunch of flash technology, or EBOF technology, that we'll graphically depict to you, that allow all of this to come together and provide a single solution that serves multiple needs.

14:50 NL: OK, just a minute or so before I hand this off to Shahar. If you look at . . . I talked about NVMe over TCP acceleration. And enough has already been said about NVMe over RDMA, so I want to give a little bit more color to NVMe over TCP. If you are able to offload NVMe over TCP, whether it's to a regular NIC like the Marvell FastLinQ or some kind of smart NIC or programmable NIC, that solution has the potential to deliver performance on par with RDMA. And it could be much, much easier to deploy, and when things are easy to deploy and manage, they are successful, and that's the simplicity of NVMe over TCP. It can work at any scale, work on any network, without requiring a special skill set. But it's important that when you look at TCP stacks, robustness is very important. For example, the stack that is there in Marvell 10/25/50/100 Gb FastLinQ NICs is a TCP offload stack hardened over more than 10 years, something that we have used and deployed in data centers for the last decade.

15:56 NL: And finally, based on our own internal measurements, we have seen that if you decide to run software NVMe over TCP, there is significant cost of I/O. There is significant amount of CPU cycles that you need to burn and that customers are unwilling to do. Cloud customers or on-prem customers want to monetize their CPU, and they want offload solutions that will help them bring their cost of I/O down and with one such benchmark, it indicates that for about 400,000 IOPS, you can save over a $1,000 of your CPU cost and eventually bringing scale and efficiency into your data center. Because that is what exactly that is required to make this solution successful. With that, I'll pass this on to Shahar and Shahar will talk to you a little bit more about the networking glue, and finally, this innovative storage box that at Marvell we call the Ethernet bunch of flash, and together how all of these solutions come together to solve today's problems, from deep learning to software-defined storage, some interesting new concepts. So, thank you, guys. Shahar, the floor is all yours.

17:10 Shahar Noy: Thank you, Nishant, for the introduction and highlighting why NVMe over Fabric is better through TCP acceleration on the initiator side or what we like sometimes to call the server connectivity side. Let's move now to the next pillar, the networking switch.

We mentioned that the networking switch is everywhere, and those devices are ubiquitous, but what is required from the Ethernet switch to be able to support the scalability and the efficiency of NVMe over Fabric? So, let's focus today, when we talk in the context of the Ethernet switch, on the storage-aware capabilities, on what is expected from the Ethernet switch to be able to support storage in a more friendly way. First and foremost, we need to talk about the performance of the Ethernet switch, because if you think about today, we have Gen3x4 drives. We have typically 24 of those in a shelf. If we multiply the performance by 24 drives, we are roughly around 600 gigabits per second of performance from the NVMe drives in the shelf. If you want to expose all of it upstream, we need another 600 gigabits of connectivity. So, at minimum, we need 1.2 terabits of switch capacity today.

18:32 SN: As we go forward and we move to Gen4 drives, and we move to EDSFF, where you can potentially put more than 24 drives in the shelf, we actually need 3x more of the performance. What does it mean? You need 3.6 terabits of switch capacity. If, God forbid, you want to put 100 SSDs in a shelf, like for your clients, you need 12.8 terabits of switch capacity. So, all this requires the switch architecture to evolve into what we call slices of terabit or tiles of terabit, so you can pick and choose what type of switches you need in your network.
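The sizing arithmetic above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not a Marvell tool: the function name is made up, the 25 Gbps per Gen3x4 drive is back-derived from the talk's 600 gigabits for 24 drives, and the 64 Gbps per drive in the 100-SSD case is back-derived from the 12.8-terabit figure.

```python
def switch_capacity_gbps(drives, per_drive_gbps):
    """Switch capacity needed so no drive is oversubscribed.

    The switch must carry the drive-facing bandwidth and an equal amount
    of upstream bandwidth, so the requirement is twice the drive total.
    """
    downstream = drives * per_drive_gbps  # bandwidth coming from the SSDs
    upstream = downstream                 # same amount exposed to the network
    return downstream + upstream


# Assumption: ~25 Gbps per Gen3x4 drive (600 Gbps / 24 drives, per the talk).
print(switch_capacity_gbps(24, 25.0) / 1000, "Tbps")   # 1.2 Tbps

# Assumption: ~64 Gbps per drive, implied by 12.8 Tbps for 100 SSDs.
print(switch_capacity_gbps(100, 64.0) / 1000, "Tbps")  # 12.8 Tbps
```

Doubling the drive total is what turns 600 gigabits of drive bandwidth into the 1.2-terabit minimum quoted for today's shelves.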

The second aspect of the switch is to be storage-aware. Remember, we're connecting NVMe drives behind it. We need the switch to be able to discover and expose those SSDs to the network, and we need flow control. Imagine putting in a mix of Optane SSDs and TLC SSDs. We might want to prioritize the Optane drives because they have better latency, so we need to somehow direct the traffic into very specific ports. By the way, this is something that's extremely difficult for a PCIe switch to do today, but an Ethernet switch with storage awareness has the capability of doing so. And lastly is the telemetry and diagnostics, and it's not only to understand the networking challenges that you have in your setup or maybe port issues, but also how we take the NVMe SMART commands through this diagnostics and telemetry and expose them up to the network.

20:04 SN: The third element is robustness. The customers we talk to, the prospects we talk to, they expect Ethernet to be as robust as PCIe. Remember, the PCIe architecture has been there for a while, more than a decade; it was hardened. Now, what do we do with Ethernet to guarantee the same level of robustness? We like to call it congestion management and control, but in some ways it's how we enable fairness between the ports to balance the traffic, how we prevent long-tail latency. All of this is something you need to consider when you look into an Ethernet switch to improve NVMe-oF performance.

And lastly, which is always the major topic of discussion, is how do we make it cost-competitive down the road? The notion in the market is that an Ethernet port is more expensive than a PCIe port, and this is true. But if you look into how you put all of this together, if you compare an Ethernet switch with a PCIe switch-based solution, the PCIe switch-based solution has compute attached, DRAM attached, a NIC attached, and all of these add to the total cost of the system. So, when you compare the scalability of NVMe over Fabric going forward, you need to take those components into consideration as well. And what we see is that the actual Ethernet-based solution is more cost effective per gigabyte of performance.

21:38 SN: So, all of those four key variables need to be considered as you start your quest into, "Hey, how do I make my NVMe-oF more efficient and more scalable?" Now, let's move into the third pillar, which is storage. So, Nishant covered the server connectivity with TCP acceleration. We just touched upon the storage-aware needs in your networking switch, we will cover more of this in a couple of . . . In the use cases we have going forward, but now let's understand what is the additional critical piece to make all of it work, and this is the storage piece altogether with Ethernet SSD. Those Ethernet SSDs need to support multifabric, but today we'll focus more on what is required to enable Ethernet SSD. Now many, many customers ask us to this day, "Hey, why even consider Ethernet SSDs, or why even consider EBOF?" which is Ethernet bunch of flash.

22:39 SN: If you look into the typical architecture of your data center for scale-out storage, once again, we talk about efficiency at scale. There's typically an application server, a storage controller, and all of those are connected through some sort of switch, some sort of a network into array of flash. Today, we call it NVMe-oF. And what is the most basic building block when you build this NVMe-oF box? It's built upon the traditional JBOF architecture. Now, JBOF is a legacy of JBOD and it was designed predominantly for SATA and SAS HDDs, and later on was converted to support the first flash SATA SSDs that came to the market. Today, the same architecture is being used to connect NVMe SSDs.

23:30 SN: Now, while this architecture is beautiful and it was created a decade ago to support the demand for more storage through HDDs, and later on SATA SSDs, it has some critical bottlenecks that prevent this architecture from scaling into the NVMe space. What you see here, the first bottleneck is the actual NIC. All the traffic through the JBOF is being terminated by the NIC. NICs, today, the most advanced ones are up to 200 Gbps. Every NIC needs to be connected to a CPU, the CPU needs to run the NIC's drivers, the CPU needs to control the data plane and have some aspects of the control plane. For some boxes it provides services, but for many other boxes, it's just a pass-through element, so it doesn't . . . We don't see a real utilization of the x86 in this particular case. It's more of a connectivity piece. We need to fan out through a PCIe switch into a bunch of NVMe drives.

24:36 SN: So, the challenge that we see here is that there's not only a bottleneck in the connectivity piece, but also bottlenecks in the x86 processing, because as you try to scale your performance, remember there's like a minimum 600 Gbps of throughput available from the NVMe drives in this shelf, to scale it and connect more NICs, you also need to increase the number of cores in your x86. You need to increase your DRAM to manage all of this traffic, and this makes the system a bit more complex, a bit more expensive. We actually heard in the market that there are some designs where the backplane is now almost as expensive as the drives that you put in the front bay.

25:19 SN: The third element or the third challenge here is reliability. The more programmable components you have in a box, and we hear it from the hyperscalers, the more it's prone to fail. What can we do to reduce the amount of programmability here? Keep it simple, keep it to fewer components, keep it more of a hardwired type of solution.

The fourth challenge with this architecture is the scalability, because every time you want to connect another JBOF to the network, you basically consume another top-of-rack port. Those boxes cannot be daisy-chained. Enter EBOF. So, what is EBOF? Ethernet bunch of flash enabled by Ethernet SSDs. So, first and foremost, we no longer terminate the traffic at a single NIC. Traffic is being terminated by Ethernet SSDs. You can think of an Ethernet SSD as an NVMe SSD with another component, like a very efficient target storage NIC, which is low powered, low footprint and sits inside the SSD. This is why the SSDs can now talk not just NVMe, but also NVMe over Fabrics; they can encapsulate or decapsulate NVMe with the Fabric.

26:35 SN: Now, the beauty of this architecture is that now you can fan out through an embedded Ethernet switch, and we talked in the previous slides about why Ethernet switches need to be more storage-aware, and here's the example. Now, the Ethernet switches you can buy today from a couple of vendors go all the way up to 12.8 terabits. If you look into the most advanced NICs, they're up to 200 gigabits per second. So, in the era of terabits, terabits of performance, Ethernet switches are more native in the way they can support all of this throughput that you purchase with your hard-earned money. NVMe drives are fast. But if we don't expose those NVMe drives to the network, we're basically creating an inefficient architecture.

27:22 SN: Additional element here is the fact that as we move to Ethernet switch, which is a state machine, and we move to Ethernet SSDs that can be enabled for a state machine, we have less programmable entities. Again, less-programmable entities means that the system can be more reliable. And this is why we hear from the big data centers that they actually like their move to Ethernet and they like this architecture, not just for flash, but maybe also for HDDs, because it makes the cluster, the storage cluster much more reliable. The fourth advantage here is the scalability. As you can see here in the back, there's a couple of boxes, a couple of additional EBOFs that can be daisy-chained through each other. So, technically, you can just . . . You just connect the first box up to the top of rack; all the other boxes can be daisy-chained together, so we're saving on the amount of ports that you need to have in your data center. So, overall, as you take this architecture and you look into all the advantages of EBOF over JBOF, you get better performance, better utilization through this innovative Ethernet SSD architecture.

28:40 SN: Now, let me try to visualize this for you and how EBOF can actually improve the efficiency of your network. Let's compare at a similar cost: we take an EBOF-based solution with 600 gigabits of connectivity to the network, which is roughly the same cost as a traditional JBOF at 200 gigabits per second. You see here that the maximum that we can get out of 600 gigabits per second is roughly 16 million I/Os. This is the network speed. As we start plugging drives into EBOF, you see drive No. 1, drive No. 2, et cetera. You see that every time we connect another drive to the network, this whole drive is available to the network. So, by the time we plug in the 24th drive, you multiply it by the number of I/Os that each drive provides to you and you get 16 million I/Os. This is what we like to call non-oversubscribed architecture, right? All the drives that you paid for, all of those big capacity drives, are now available to the network.

29:49 SN: If you take a similar JBOF system at the same price point, it can provide you only 200 gigabits per second. It has compute, like we discussed in the JBOF page. It has compute and networking limitations that would allow your drives to scale only up to the eighth drive. So, let's see here, we add one, two, three -- by the time we hit the eighth drive, we basically saturate the capabilities of the JBOF. So, with every additional drive that we add later on, what happens here is that the average performance per SSD starts dropping. By the time you plug your 24th drive into the shelf, you only get one-third of the capability or the overall performance of the drive.

30:34 SN: So, now you see the challenges with JBOF, right? You pay for all of those drives. Those drives will be higher capacity down the road; they're high performing, but you only get one-third of the bandwidth available through this inefficient JBOF architecture.
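The EBOF-versus-JBOF comparison above comes down to a simple cap: the shelf delivers whichever is smaller, the sum of the drives or the network connectivity. A minimal Python sketch of that model follows. The function name is hypothetical, and the 25 Gbps per drive is an assumption back-derived from the talk's 600 gigabits across 24 drives; the talk itself plots I/Os rather than gigabits, but the ratios are the same.

```python
def delivered_fraction(drives, per_drive_gbps, network_gbps):
    """Fraction of each drive's performance that reaches the network.

    The shelf can deliver at most `network_gbps`; once the drives together
    exceed that, every extra drive lowers the average share per drive.
    """
    offered = drives * per_drive_gbps       # what the SSDs could deliver
    delivered = min(offered, network_gbps)  # what the shelf can pass upstream
    return delivered / offered


PER_DRIVE_GBPS = 25  # assumption: 600 Gbps / 24 drives
EBOF_GBPS = 600      # from the talk
JBOF_GBPS = 200      # from the talk

for n in (1, 8, 16, 24):
    ebof = delivered_fraction(n, PER_DRIVE_GBPS, EBOF_GBPS)
    jbof = delivered_fraction(n, PER_DRIVE_GBPS, JBOF_GBPS)
    print(f"{n:2d} drives: EBOF {ebof:.2f}, JBOF {jbof:.2f}")
```

Under these assumptions the JBOF stays at full per-drive performance through the eighth drive and falls to one-third at 24 drives, while the EBOF stays at 1.0 throughout, which matches the numbers quoted in the talk.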

Let's try to take a higher-level view now and put it all together: how this JBOF or EBOF connects inside your data center. In the traditional architectures that you have, compute will run applications, VMs, containers, and all the data will get pushed down through a storage target, which manages and protects the data, down to your JBOF. What you see here, in the differences between the orange and green, is that we convert protocols. This is an additional inefficiency that exists in the data center today, and then we push it down to the JBOF, and we mentioned why JBOF doesn't expose all the storage bandwidth that you have down here. So, all in all, this architecture was beautiful for the SATA and the SAS days, but now as we move to NVMe, this architecture actually creates a lot of challenges. Enter NVMe-oF, right?

31:45 SN: NVMe-oF has a huge promise. It's all about how we unify this transport so the compute can talk NVMe-oF all the way down to the drive; it's all orange now. It's all the same language. But we still have the JBOF limitation down here. The same issues with compute and not enough networking to expose all of those drives to the network, therefore enter Ethernet SSD to enable all of this throughput up. But the beauty of this architecture is more than just fulfilling or replacing JBOF. Now, because this EBOF enables all of your drives to be available to the network, why not move the data protection piece into the compute, and run it as a container or as a VM? In this way, you can grow your compute independently of storage. There are no more bottlenecks in between because, thanks to EBOF, all of those drives are available to the network. So, if your workload needs more data, you just scale your EBOFs. If your workload needs more compute, more GPUs as an example, a big AI application, you just scale your compute. In any case, what you pay for, you get the ability to use its entire resources, because through EBOF all of the SSDs are exposed to the network.

33:09 SN: Now, let's examine couple specific use cases. The first one is the emergence of SDS. So, once again, summarizing the limitations of existing JBOF, we need to convert the protocol as we go down. In this case, you talk TCP or InfiniBand, all the way to the storage target. And then you convert it to Fibre Channel protocol or iSCSI over Ethernet to talk to your drives. A lot of inefficiencies in this path.

As we look into SDS, now you can move your deduplication, you can move your encryption, you can move your compression, all to run on one of the computes that you have available in your data center. If it's not available for a specific tenant, it can be available to run specific storage management application. And by doing this, you make your entire compute infrastructure much more efficient.

Think about what's happening here: you have compute which is dedicated to storage services -- it's inefficient, it doesn't work all the time. In the case of disaggregation, the SDS can take advantage of the compute which is available in your data center. Now, thanks to EBOF, you can expose all the drives once again to the network, so all of your compute customers, all the tenants that need a piece of storage, can have easy access, parallel access to all the SSDs that exist in the data center.

34:41 SN: A more exciting use case is AI, and if you look into the most advanced cluster, GPU cluster out there, they're composed of a bunch of Nvidia's A100 silicons. Nvidia calls it their DGX system. If you look into the DGX system, it's not only limited by JBOF, but also by the internal interconnects between GPUs: as they push data down to the drives, they are either limited . . . A100 doesn't have more than a couple of I/O ports for storage. And if you want to fan it out, you need to go through some sort of x86. If you go through some sort of x86, you need to use a bounce buffer, and the bounce buffer . . . And Nvidia is very public about it, limits your performance to 50 gigabytes per second, while a typical DGX system can give you 200 gigabytes per second.

So, you're using just one-fourth of the GPU cluster capability because of the storage bottleneck. Interesting, right? So, enter EBOF. EBOF can talk RDMA. So, what's happening right now is that the GPU cluster, thanks to Nvidia's new stack, which they call GPUDirect Storage, can communicate directly with EBOF. So, we bypass the x86 that exists here, we bypass the bounce buffer that limits the performance, and we can provide seamless high-throughput access to big data lakes through EBOF.
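The one-fourth figure follows from treating the data path as a chain whose throughput is set by its slowest stage. A small Python sketch of that reasoning, using only the two numbers quoted in the talk; the function and stage names are illustrative, and network and drive limits on the direct path are deliberately left out because the talk gives no figures for them.

```python
def path_throughput(stages):
    """Throughput of a serial data path is that of its slowest stage."""
    return min(stages.values())


# GB/s figures quoted in the talk; stage names are hypothetical labels.
via_x86 = {"gpu_cluster": 200.0, "x86_bounce_buffer": 50.0}
direct = {"gpu_cluster": 200.0}  # bounce buffer bypassed, other limits not modeled

print(path_throughput(via_x86) / via_x86["gpu_cluster"])  # 0.25
print(path_throughput(direct) / direct["gpu_cluster"])    # 1.0
```

In a real deployment the direct path would still be bounded by the fabric and the drives behind it; the sketch only shows why removing the 50 GB/s stage lifts the ceiling.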

36:20 SN: If you guys have the time during the FMS show, I urge you to listen to the Micron presentation, because Micron is actually going to demonstrate how EBOF performs in a GPUDirect Storage environment. Some staggering numbers in terms of improvements.

OK, let's put it all together, right? So, the radical new approach, as we think about NVMe-oF at scale, is to ensure that the server connectivity has acceleration, TCP acceleration, that networking is also storage-aware and that storage is Ethernet SSD-capable. If we put it all together, once again on the left you see all the limitations we addressed on JBOF: the mix of networks, the limitations in the JBOF performance. Now, look at a NIC which is capable of universal RDMA or capable of TCP offloading, pushing the data all the way to an Ethernet switch that exists inside the EBOF. This Ethernet switch is storage-aware, so it can help with discovering the drives, it can help with managing the drives, and it provides a lot of services which are tuned for storage specifically. And then you have the capability of enabling Ethernet SSDs and exposing them; in this example, we have partners today that are capable of up to 3.2 terabits of connectivity into EBOF. You actually unlock all the NVMe performance to the network, and you actually overcome all the bottlenecks that we covered here on the left-hand side.

38:01 SN: I would like to thank Nishant for helping and leading the discussion today. And I would like to thank you guys for attending and listening to this exciting, radical approach of how to scale NVMe-oF, going forward. Thank you.
