Published: 27 Feb 2014
Is your company's storage network architecture struggling to keep pace with virtualized server environments and speedy flash storage? A major upgrade might be in order.
The storage network is a frequently neglected component of a virtualization initiative or clustered database rollout. Greater priority is typically given to the servers and the storage system itself, rather than the network that connects them. In the past that was a safe bet, because the storage network was more than capable of delivering adequate performance. But with the advent of high-density server virtualization and near-zero-latency flash storage, the storage network is becoming a bottleneck and -- like it or not -- IT planners need to review their upgrade options.
The three layers of a data center
A data center can be divided into three layers:
- The compute layer that runs the applications
- The storage layer that stores the data created by the compute layer
- The networking layer that connects the compute and storage layers
The compute layer has become significantly more powerful as virtualization has enabled a single server to support dozens of virtual servers, and the storage layer has become significantly more responsive thanks to flash-based storage. These are revolutionary changes that have had a dramatic impact on the data center.
The network layer in most data centers has been evolving as well, an evolution motivated more by obsolescence than by a need for greater performance or capabilities. These upgrades have typically happened gradually as the cost of next-generation storage network components such as adapters and switches continues to drop. Essentially, the storage network is typically on a slow, rolling upgrade as new servers are added to the data center.
The network performance chasm
As a result of these revolutionary performance changes in the compute and storage layers, a performance chasm has opened at the storage networking layer. The compute layer can generate tremendous, random storage I/O demand and, thanks to flash, the storage layer can respond in kind. But the networking layer no longer has the latency of spinning hard disks to hide behind and, as a result, has become the bottleneck.
IT planners are suddenly faced with the need to upgrade their entire storage network at once, but the options can be overwhelming with alternatives such as Gen 5 (16 Gbps) Fibre Channel (FC), FC over Ethernet, 10 Gigabit Ethernet (10 GbE), 40 GbE, InfiniBand and server-side networking.
Let's compare the various infrastructure options that each of these network types provides.
IP networks (10 GbE and beyond)
Internet Protocol (IP)-based storage networks are generating a lot of interest because they're perceived as less expensive and easier to maintain. After all, most data centers already use an IP-based network for user-to-server and server-to-server communication.
So the inclination is to use the expertise and physical assets invested in those communications to allow servers to communicate with storage. Doing so should lower the price of the storage infrastructure and its operating costs, since it wouldn't require special skills.
Most existing IP storage architectures use a trunked or bonded set of 1 GbE connections for performance and redundancy. The move to 10 GbE is appealing because those multiple 1 GbE connections can be eliminated. But 10 GbE networks, just like the 1 GbE networks that preceded them, have some issues that network designers need to be aware of. One issue with IP-based storage networks is that most IP architectures still use the spanning tree protocol (STP), which means that only one path between any two points can be active at a time.
The weakness of STP is that it was designed in the days of Ethernet hubs before switches were commonly available. In general, STP ensures there's only one active path between two points in the network. The goal was to create a loop-free infrastructure.
In the modern network, there are always redundant paths, but with STP they're blocked or turned off. When an active path fails, the network has to re-converge on a new path. In a large network, re-convergence can take a few seconds. As a result, not only is potential bandwidth wasted, but there's a "hesitation" in the entire network when an inactive path needs to be activated.
Having inactive paths in 1 GbE wasn't a big concern because there wasn't much bandwidth being wasted and each link was very inexpensive. The move to 10 GbE makes this a more glaring concern since the amount of wasted bandwidth is potentially 10x greater and the cost of each connection is significantly higher.
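To put rough numbers on that concern, here's a back-of-envelope sketch. The link counts per server are illustrative assumptions, not figures from the article:

```python
# Illustrative comparison of bandwidth idled by STP-blocked redundant links.
# Link counts and speeds per server are assumptions for the arithmetic only.

def blocked_bandwidth(active_links, blocked_links, gbps_per_link):
    """Return (total, usable, wasted) bandwidth in Gbps."""
    total = (active_links + blocked_links) * gbps_per_link
    usable = active_links * gbps_per_link
    return total, usable, total - usable

# Legacy design: 2 active + 2 STP-blocked 1 GbE links per server
print(blocked_bandwidth(2, 2, 1))    # (4, 2, 2) -> 2 Gbps sits idle
# 10 GbE design: 1 active + 1 STP-blocked 10 GbE link per server
print(blocked_bandwidth(1, 1, 10))   # (20, 10, 10) -> 10 Gbps sits idle
```

Same redundancy scheme, but the idle capacity grows from 2 Gbps to 10 Gbps per server, and each of those idle links costs far more.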
IP networking vendors are trying to create fabric-based IP networks that don't have these blocking issues based largely on Transparent Interconnection of Lots of Links (Trill). But each implementation is unique and interoperability between providers seems sketchy. In addition, the cost of the hardware for these implementations is significantly higher than their standard Ethernet counterparts.
IP networks have the overhead of translating between SCSI and IP, and the general overhead of TCP. While these issues can be overcome by using special network interface cards that offload the IP stack and help with IP to SCSI conversion, installing these cards increases the cost and complexity of a seemingly simple IP network.
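The translation overhead above can be approximated with textbook header sizes. This sketch assumes IPv4 and TCP without options and, for simplicity, one iSCSI basic header segment per frame; real traffic varies with options, digests and PDU sizes:

```python
# Rough per-frame payload efficiency of iSCSI over standard Ethernet framing.
# Header sizes are textbook values; this is a back-of-envelope sketch, not a
# measurement of any particular stack.

ETH_HEADER = 14 + 4        # Ethernet header + frame check sequence
IP_HEADER = 20             # IPv4, no options
TCP_HEADER = 20            # TCP, no options
ISCSI_BHS = 48             # iSCSI basic header segment (assumed one per frame)

def payload_efficiency(mtu=1500):
    """Fraction of each frame carrying actual SCSI payload data."""
    payload = mtu - IP_HEADER - TCP_HEADER - ISCSI_BHS
    frame = mtu + ETH_HEADER
    return payload / frame

print(f"{payload_efficiency(1500):.1%}")   # standard frames
print(f"{payload_efficiency(9000):.1%}")   # jumbo frames
```

The headers themselves are a modest tax; the larger costs are the CPU cycles spent on the TCP stack and SCSI encapsulation, which is what the offload cards address.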
16 Gbps Gen 5 Fibre Channel
Most storage professionals are very familiar with FC storage networks, which have a reputation for being more expensive and complicated than Ethernet networks. But FC is a fabric-based storage topology: every link is active, so none of the 16 Gbps bandwidth of Gen 5 FC is wasted on blocked links, and there's no re-convergence period if a link fails.
Fibre Channel is also a lossless network: buffer-to-buffer flow control keeps frames from being dropped under congestion, so there are no retransmissions as there are with IP-based networks. Finally, no protocol translation needs to occur, since FC carries SCSI traffic natively.
Fibre Channel may represent the "old guard" in storage networking, but the Gen 5 architecture delivers nearly all of its theoretical bandwidth and includes specific capabilities to prioritize I/O bandwidth at the granularity of a single virtual machine (VM).
Server-side networking
Server-side networks often leverage software-defined storage to aggregate internal storage across the server hosts that make up the compute layer of a virtual infrastructure. That aggregated storage is then shared as a single pool. These forms of storage networking actually collapse the data center layers described above, but don't eliminate them. They all still reside on the same physical devices, the hosts that make up the virtual infrastructure.
Server-side networking relies, for the most part, on IP-based protocols, although some use InfiniBand. They should help with some of the IP issues described earlier because the overhead of dealing with IP communication is addressed by the CPU power within the physical hosts. Also, STP networking issues should be minimized since the connections between servers are private, requiring fewer connections.
Some of these solutions will segment data across the aggregated pool of storage. Doing so provides the data protection and data sharing needed in clustered environments. But it also introduces network latency, since there's still a network connecting the servers.
A few of these offerings mitigate this problem by allowing a virtual machine's data to be stored on the host the VM resides on and then distributed across the pool. All reads come from the local server, eliminating latency. Writes go to both the local pool and the aggregated pool, enabling data safety and sharing. Essentially, these systems give up some capacity efficiency for additional performance.
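The read-local, write-everywhere placement described above can be sketched in a few lines. This is a minimal toy model; all class and host names are hypothetical, replication is simplified to full copies rather than striping, and real products add quorum, failure handling and rebalancing:

```python
# Minimal sketch of the "read locally, write everywhere" placement idea
# behind some server-side storage pools. Names are hypothetical; full
# replication here stands in for the real products' distribution schemes.

class PooledStore:
    def __init__(self, host_names):
        self.replicas = {h: {} for h in host_names}  # host -> {key: block}

    def write(self, local_host, key, value):
        # Writes land on the local host AND every peer, so the data stays
        # available if local_host fails (capacity traded for safety).
        for store in self.replicas.values():
            store[key] = value

    def read(self, local_host, key):
        # Reads are served from the local copy, avoiding network latency.
        return self.replicas[local_host][key]

pool = PooledStore(["esx1", "esx2", "esx3"])
pool.write("esx1", "vm42-disk", b"...block data...")
print(pool.read("esx2", "vm42-disk") == pool.read("esx1", "vm42-disk"))  # True
```

The capacity cost of keeping extra copies is exactly the efficiency-for-performance trade the article describes.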
Extreme networking: InfiniBand, 40 GbE and Gen 6 FC
There are several more extreme or esoteric storage networking choices available today, including InfiniBand, 40 GbE and Gen 6 Fibre Channel. The latter two are essentially upgrades to the current IP and FC standards, providing greater bandwidth. InfiniBand, on the other hand, is a flat network similar to FC but it runs at 40 Gbps.
InfiniBand is commonly found within storage systems as an interconnect of clustered nodes in environments that demand extremely high-performance I/O, such as high-frequency trading. It's also a popular choice as a server interconnect for mirroring flash devices between servers in high-availability configurations.
While InfiniBand can certainly be made to work in many storage network configurations, especially server-side networks, the chances of it becoming widely adopted seem slim. It will remain a viable special use-case network.
There's no question that many data centers have reached the tipping point where, without at least a gradual upgrade to a next-generation storage network architecture, the potential return on investment (ROI) in the compute and storage layers will be limited. Without the raw bandwidth and other capabilities these networks provide, virtualization density hits a ceiling and flash storage can't deliver its full performance.
The question is which network should be chosen for the next-generation storage infrastructure. For the most part, that decision will depend on what's in place today. Most organizations will stick with the topology and protocol they already own, and simply move to its next generation. But IT planners should consider which of the available architectures will allow them to extract maximum ROI from investments in the compute and storage layers. There should be little hesitancy about moving to a new architecture if it can justify itself with greater density and greater performance.
About the author:
George Crump is president of Storage Switzerland, an IT analyst firm focused on storage and virtualization.