Published: 14 Oct 2002
There's an old saying that goes something like "May you live in interesting times." Just when most organizations have finally climbed the learning curve and become comfortable with Fibre Channel (FC) solutions, a completely new crop of storage networking infrastructure technologies has popped up with a lot of hype and fanfare. They are iSCSI, InfiniBand and multiprotocol intelligent switches. Evaluating the value propositions and ROI of these new technologies can be a daunting challenge. Indeed, these are interesting times for storage networking.
|InfiniBand simplifies server I/O|
Ever since iSCSI appeared on the horizon, the hype has been deafening. Initial iSCSI enthusiasts boasted it would rule the storage area network (SAN) world, and FC would just become another victim of Ethernet road kill. That was two years ago. The claims today aren't nearly as brash and have attempted to focus on more realistic value propositions. Examining the value propositions requires a closer examination of the technology itself.
iSCSI is a block storage transport protocol - it takes the common block storage SCSI-3 protocol and maps it to TCP/IP. This allows it to be encapsulated in Ethernet frames and transported over standard LAN/WAN infrastructures. iSCSI is designed to allow block storage SANs to be deployed on IP networks.
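The mapping described above can be sketched in a few lines. This is a simplified illustration, not a working initiator: it builds a SCSI READ(10) command and wraps it in a 48-byte iSCSI header, assuming the header layout of the iSCSI specification as later published by the IETF (RFC 3720). The function names, the default LUN encoding and the sample values are hypothetical, and a real initiator would also need login, sequencing and error handling before handing the result to a TCP socket.

```python
import struct

def build_read10_cdb(lba, blocks):
    """SCSI READ(10) CDB: opcode 0x28, 32-bit LBA, 16-bit block count (10 bytes)."""
    return struct.pack(">BBIBHB", 0x28, 0, lba, 0, blocks, 0)

def build_iscsi_scsi_command(cdb, expected_len, task_tag, cmd_sn, exp_stat_sn,
                             lun=bytes(8)):
    """Wrap a CDB in a simplified 48-byte iSCSI SCSI Command header.

    Assumed layout: opcode, flags, 2 reserved bytes, AHS length,
    3-byte data segment length, 8-byte LUN, task tag, expected transfer
    length, CmdSN, ExpStatSN, then the CDB padded to 16 bytes.
    """
    opcode = 0x01          # SCSI Command
    flags = 0x80 | 0x40    # Final bit + Read bit
    header = struct.pack(">BB2xB3s8sIIII",
                         opcode, flags,
                         0,                # no additional header segments
                         b"\x00\x00\x00",  # no immediate data
                         lun, task_tag, expected_len, cmd_sn, exp_stat_sn)
    return header + cdb.ljust(16, b"\x00")

# Hypothetical request: read 8 blocks of 512 bytes starting at LBA 2048.
cdb = build_read10_cdb(lba=2048, blocks=8)
pdu = build_iscsi_scsi_command(cdb, expected_len=8 * 512,
                               task_tag=1, cmd_sn=1, exp_stat_sn=1)
# pdu is what would be written to the TCP connection; TCP/IP and Ethernet
# add their own headers underneath.
```

The point of the sketch is the layering: the SCSI command is unchanged, and iSCSI adds only the header needed to carry it over a TCP byte stream.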
iSCSI value propositions
iSCSI was designed to solve negative FC perceptions such as FC's high cost, complexity and interoperability issues. The proposed value propositions for iSCSI are:
1. Reduction or elimination of SAN professional service costs - the SAN is based on the same Ethernet technology most organizations already have, with a large pool of knowledge workers.
2. Lower SAN hardware costs because of lower cost copper cabling for Gigabit Ethernet (GigE) and 10/100, as well as lower switch and interface costs.
3. Faster SAN deployments requiring less IT staff time because of the Ethernet familiarity.
4. Simpler SAN management - iSCSI SANs utilize the well-known TCP/IP and Ethernet management tools.
5. Elimination of interoperability issues because of the reuse of Ethernet technologies and the tight standards of the Internet Engineering Task Force (IETF).
6. Elimination of multiple fabric infrastructures for client server and block storage - both will run on the same technology even if they require separate networks.
7. Elimination of distance boundaries for SANs because IP networks have no distance boundaries.
8. Equal or better performance vs. FC SANs.
So how valid are these value propositions?
Reduction or elimination of SAN professional services. The premise of this value proposition is that most SAN professional services are tied to FC setup and management, but that's only partially true. A key factor in SAN setup is the upfront work of the SAN assessment, which determines which initiators (applications and servers) go to which targets (storage volumes), and how much storage each initiator needs. That work is needed regardless of the transport. iSCSI can potentially reduce the professional services needed for the setup itself. Most FC SANs require fairly complex zoning setups - QLogic being the exception. This would be eliminated in an iSCSI SAN and be replaced by VPNs, which are easier to set up. The value proposition accuracy: medium.
|InfiniBand Server I/O|
Lower SAN hardware costs. There are two flawed premises with this value proposition. The first is that current GigE and Ethernet product pricing will translate into lower iSCSI product pricing. The second is that FC pricing is high and will remain high. Both are incorrect. iSCSI interfaces will require additional silicon in the form of TCP/IP offload engines (TOEs); otherwise, most of the CPU cycles will be consumed reading and writing to disk instead of processing applications. This additional silicon increases costs, while FC costs have declined significantly. The price differential for optical interface adapters has become nominal, and there are even FC chips being put on the server motherboards for prices lower than iSCSI NICs. Even switch pricing has gotten fairly close. Category 5 copper cabling is where iSCSI still has an advantage, since FC has no copper interface for 2Gb/s. When GigE and FC technology go to 10Gb/s, neither will have a copper interface. The value proposition accuracy: low.
Faster SAN deployments. The underlying premise is that the primary SAN deployment difficulty is FC complexity. The reality is that the majority of that difficulty comes from making SAN-attached storage shareable. Ethernet and iSCSI do nothing to eliminate that work, although they are less convoluted in how it's accomplished. The value proposition accuracy: medium.
Lower cost and simpler SAN management. The number of TCP/IP and Ethernet workers is greater than the number of FC workers. It's likely that iSCSI and Ethernet SANs using current known management systems with extensions would be more intuitive to most IT organizations than FC SAN management systems. Even though most of the FC SAN management systems have borrowed heavily from common TCP and Ethernet management systems, they're still different. The value proposition accuracy: medium.
Elimination of interoperability issues. Today, FC interoperability issues are vendor-specific. The issue is essentially limited to FC switches and specific feature functionality, i.e., there's standard interoperability today and each vendor has value added extensions that only work with their equipment. The value proposition accuracy: low.
Elimination of multiple fabric infrastructures for client server and block storage. A total commitment to iSCSI on Ethernet for SANs would eliminate FC as a fabric to manage, providing a common technology for client server and storage networks. It wouldn't necessarily eliminate multiple fabrics - the characteristics of block SANs are different from client server, and usually require their own fabric. The only place iSCSI will have a significant impact on fabric consolidation is network-attached storage (NAS). Adding iSCSI to NAS means the file storage system can now run block storage on the same device. It not only consolidates storage fabrics, it consolidates storage devices. The value proposition accuracy: medium.
Elimination of distance boundaries for SANs. The only application driving SANs over distances larger than a campus is business continuity. Specific business continuity applications include disk-to-disk replication, disk mirroring and backup to tape or tape vaulting. FC manages this with technologies such as FCIP or iFCP (Fibre Channel frames tagged with an IP address). There's no significant advantage for iSCSI with these applications. The value proposition accuracy: low.
Equal or better performance than FC SANs. This claim was partially true when FC was at 1Gb/s. Now that nearly 100% of FC implementations are 2Gb/s, there's no truth in the claim. The other issue is the importance of low latency in SANs. GigE switches typically have multiple orders of magnitude greater latency than FC. The value proposition accuracy: low.
iSCSI provides some real capital and operating cost benefits for NAS/SAN combinations, and makes sense for organizations that don't need the performance of FC or don't want to go outside their comfort zone of Ethernet. However, it appears to have limited value for data centers.
InfiniBand is one of the most misunderstood and under-hyped technologies. InfiniBand was developed to eliminate the server I/O bottleneck. I/O has increasingly become the gating factor to increasing server performance. With the recent decisions by Intel and Microsoft to eliminate or reduce their development of InfiniBand products, there's been much confusion as to its market viability. Many in the financial community and industry believe InfiniBand will die because Intel and Microsoft have abandoned InfiniBand, but they're incorrect.
|How to calculate an ROI|
The premise is wrong because neither Intel nor Microsoft has abandoned InfiniBand. Intel is still a major player in the InfiniBand Trade Association (IBTA), and is supporting developers' conferences for InfiniBand.
Intel decided not to develop its own InfiniBand silicon because of volume considerations. InfiniBand isn't a desktop or laptop technology, meaning Intel wouldn't have the volumes to justify the development. Microsoft didn't kill InfiniBand - it delayed support until the market emerges.
InfiniBand is a detailed, tight specification for serial I/O. It defines the technology for interconnecting CPUs and I/O nodes to form a fabric that's independent of the host operating system and processor platform. InfiniBand is designed to be a robust low latency architecture (hundreds of nanoseconds) replacement for the aging PCI bus.
InfiniBand uses a shared-memory rather than a shared-bus architecture, built on a concept called virtual pipes or lanes. Virtual pipes allow multiple fabrics to coexist on a single topology (see "InfiniBand Server I/O"). Performance definitions include 1X (2.5Gb/s) links, 4X (10Gb/s) links and 12X (30Gb/s) links. The native protocol is VIA (virtual interface architecture), which is an OS bypass utilizing nominal CPU cycles. The network layer uses IPv6-style addressing.
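The three link widths are multiples of a single 2.5Gb/s lane, which a short calculation makes explicit. The sketch below also estimates the usable data rate, assuming 8b/10b line coding (8 data bits carried in every 10 signaled bits); that encoding overhead is an assumption added here for illustration and is not stated in the article. The names are hypothetical.

```python
LANE_SIGNAL_GBPS = 2.5          # per-lane signaling rate quoted in the article
ENCODING_EFFICIENCY = 8 / 10    # assumed 8b/10b line coding

def link_rates(width):
    """Return (signaling rate, estimated data rate) in Gb/s for an nX link."""
    raw = width * LANE_SIGNAL_GBPS
    return raw, raw * ENCODING_EFFICIENCY

for width in (1, 4, 12):
    raw, usable = link_rates(width)
    print(f"{width}X: {raw:g} Gb/s signaling, about {usable:g} Gb/s of data")
```

Under that assumption, the 4X link's 10Gb/s signaling rate carries roughly 8Gb/s of payload, which is worth keeping in mind when comparing quoted link speeds across technologies.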
InfiniBand is the first simplified I/O fabric designed from the ground up for all aspects of server I/O. Now the servers can be managed, scaled and maintained, without disruptions to either the storage or IP fabrics.
InfiniBand value propositions
1. Reduced capital and operating costs because there are fewer systems and parts purchased, implemented and managed to accomplish all the I/O.
2. Reduced complexity because of the nondisruptive plug-and-play nature of InfiniBand.
3. Reduced management costs because the I/O is managed as a central single entity.
4. Reduced infrastructure costs because there will be fewer Ethernet and FC switch ports.
5. Lower cost server scaling through clustering and lower cost DBMS clustering vs. higher cost SMP (symmetric multiprocessing) or NUMA (non-uniform memory access) machines because of the lowest latency network for IPC (interprocess communication).
6. Performance future proofing because it is 10Gb now and is already specified for 30Gb later.
7. Reduced power and real estate costs because the server I/O disaggregation reduces the footprints and power requirements of every server, in addition to the reduction of actual powered equipment.
Reduced capital and operating costs. Disaggregated shared I/O means fewer adapters, NICs, switch ports and fabric switches. This should equate to reduced capital expenditures, but InfiniBand is a new technology that doesn't have the volumes of FC or Ethernet. Even though there's less equipment to buy, the capital savings work out to be negligible. These savings increase dramatically when clustering and/or operating costs are added back in. The value proposition accuracy: medium.
Reduced complexity. Disaggregated shared I/O equals less equipment. Less equipment equals simpler fabrics. The value proposition accuracy: high.
Reduced management costs. Disaggregated shared I/O equals less equipment, which means less IT staff time is required to manage the fabrics, saving operating costs. The value proposition accuracy: high.
Reduced infrastructure costs. Disaggregated shared I/O means fewer cables, cable runs, transceivers and connectors. The value proposition accuracy: high.
Lower cost server scaling through clustering and lower cost DBMS clustering. The concept is called scaling out instead of scaling up. SMP and NUMA servers are significantly more complex and expensive than clustered rack or blade servers. For clustered servers and DBMS clusters to be as effective as SMP and NUMA servers, the clustering interconnect must be very low latency and very high bandwidth. InfiniBand meets those requirements. The value proposition accuracy: high.
Clear upgrade path. InfiniBand is the only released 10Gb interconnect today (4X), and the only one specified for 30Gb (12X) tomorrow. The value proposition accuracy: high.
Multiprotocol, intelligent switches
This latest storage networking technology will allegedly solve all IT data center infrastructure problems. A good way to think of this technology would be director-class SAN switches, carrier class Ethernet switches, storage subsystem controllers and protocol gateways all rolled into one system.
Multiprotocol intelligent switches are similar to directors because they are high port count switches with 32 to 256 ports. Some have specifications of up to 2,400 ports in a single image. They are also similar because they have no single points of failure, have hot swappable parts and provide essentially nondisruptive operations.
One way multiprotocol intelligent switches are different from directors is that they are protocol and topology agnostic. They don't care if the SAN is FC, iSCSI, TCP/IP on GigE, FCIP, iFCP or InfiniBand. They tend to support all or most of them, converting one to the other and back again at wire speeds. They are also different in that they tend to have storage applications integrated in the switch. These applications typically include some mix of storage virtualization, volume replication, remote mirroring, NAS and application-aware storage. Many of these architectures have processors on every port. This additional functionality comes with a very steep penalty: higher equipment and maintenance costs, and higher fabric latency. The proprietary nature of the equipment also typically means a vendor lock-in environment.
Intelligent multiprotocol switches value propositions
1. Reduced downtime costs because of nonstop operations.
2. Reduced capital and operating costs because of fabric infrastructure consolidation.
3. Reduced infrastructure management costs because of one seamless multiprotocol fabric.
4. Reduced storage system costs because storage applications in the fabric mean backend storage can be less expensive systems such as JBOD.
5. Reduced storage management costs because of the centralization of storage functions such as disk-to-disk replication, remote mirroring, snapshot, etc.
Reduced downtime costs. The high availability of these switches means the key switches involved with fabric uptime are in the five nines category, with less than six minutes of downtime per year. The high availability is provided for all fabrics that are connected. The downside is that disruptions in one fabric can cause disruptions in other fabrics; however, most of the products available and in development prevent that from happening. The value proposition accuracy: high.
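The "less than six minutes" figure follows directly from the availability percentage. A minimal sketch of the arithmetic, assuming a 365-day year (the function name is illustrative):

```python
MINUTES_PER_YEAR = 365 * 24 * 60   # 525,600 minutes in a non-leap year

def downtime_minutes_per_year(availability):
    """Allowed downtime in minutes per year for a fractional availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR

five_nines = downtime_minutes_per_year(0.99999)
print(f"Five nines allows about {five_nines:.2f} minutes of downtime per year")
```

Five nines (99.999%) works out to roughly 5.26 minutes per year, while four nines (99.99%) would allow about 52.6 minutes.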
Reduced capital and operating infrastructure costs. This is based on the premise that consolidating storage networking infrastructures on a single multiprotocol switch reduces or eliminates redundant switches, adapters, gateways, etc. Less hardware means reduced management. Centralizing on a single fabric switch infrastructure means one management system. However, the higher cost per port and higher maintenance pricing - usually based on a percentage of MSRP - typically mean the savings from consolidation are nil. The value proposition accuracy: low.
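One way to test this proposition against a real quote is to compare multi-year costs, with maintenance charged as a yearly percentage of the purchase price. The sketch below is only a model of that comparison: the function name is hypothetical and every figure in the example is a made-up placeholder, not vendor pricing, to be replaced with actual quotes.

```python
def fabric_cost(ports, price_per_port, maintenance_rate, years):
    """Capital cost plus maintenance, charged as a yearly fraction of capital."""
    capital = ports * price_per_port
    maintenance = capital * maintenance_rate * years
    return capital + maintenance

# Placeholder inputs for illustration only.
separate = fabric_cost(ports=128, price_per_port=1000.0,
                       maintenance_rate=0.10, years=3)
consolidated = fabric_cost(ports=96, price_per_port=1500.0,
                           maintenance_rate=0.15, years=3)

savings = separate - consolidated
print(f"Separate fabrics: {separate:,.0f}")
print(f"Consolidated:     {consolidated:,.0f}")
print(f"Savings:          {savings:,.0f}")
```

With these placeholder inputs, buying fewer ports does not offset the higher per-port and maintenance pricing, which is the pattern the paragraph above describes. Different inputs can of course reverse the result.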
Reduced infrastructure management costs. The premise is that the smart switch is the single switch infrastructure for all protocols and fabrics. This is unlikely to be a common environment. Most IT operations are already using multiple fabric switches, which most likely won't be replaced. This means infrastructure management won't be reduced and could potentially increase. The value proposition accuracy: low.
Reduced storage system costs. By moving some or all of the storage functions into the fabric switches, the perception is that the cost of the disk storage will decrease. The concept is that the intelligent switch becomes a very large scalable storage subsystem controller. It connects the back-end disk on the same fabric as the front-end servers. The premise is the combination of intelligent switches plus JBOD is cheaper than standard fabric switches and storage subsystems with RAID. This premise ignores the ongoing price decreases of storage systems with newer functionality and technology. The value proposition accuracy: low.
Reduced storage management costs. This is tied to the premise of moving some of the storage functionality from the subsystems to the fabric and centralizing it. This provides a single methodology for many business continuity applications (disk-to-disk replication, snapshot, server-less backup, remote mirroring, etc.), sometimes NAS, and even volume allocation for heterogeneous storage. Unfortunately, unless the switch performs all of the storage management functions (such as RAID 0, 1, 0+1, 3, 5, disk defragmentation, etc.), it complicates the management by requiring multiple management points. Most of the products in development do not claim all storage subsystem functionality. The value proposition accuracy: low.
If high availability is critical, the value of these intelligent, multiprotocol switches is high. However, the other value propositions are low. And there's one other caveat: These switches usually have measurably higher latency than fabric-specific switches. Of course, higher latency for block storage means lower performance.
Before introducing any new technology, always qualify and quantify the value propositions. Then calculate the ROI using simple common sense techniques. As this evaluation of three emerging technologies demonstrates, the value proposition accuracy varies significantly. In general, it's high for InfiniBand, medium to low for iSCSI and low for the intelligent, multiprotocol switches.
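The overall verdict can be checked by tallying the individual accuracy ratings given in this article. The sketch below assumes a simple numeric scale (low = 1, medium = 2, high = 3) and an unweighted average; both are illustrative choices rather than part of the evaluation itself, and a real assessment would weight each proposition by its importance to the organization.

```python
from statistics import mean

# Assumed numeric scale for the article's ratings.
SCORE = {"low": 1, "medium": 2, "high": 3}

# Ratings in the order each value proposition is evaluated in the article.
RATINGS = {
    "iSCSI": ["medium", "low", "medium", "medium",
              "low", "medium", "low", "low"],
    "InfiniBand": ["medium", "high", "high", "high", "high", "high"],
    "Multiprotocol intelligent switches": ["high", "low", "low", "low", "low"],
}

def average_score(ratings):
    """Unweighted average of the ratings on the assumed 1-to-3 scale."""
    return mean([SCORE[r] for r in ratings])

for technology, ratings in RATINGS.items():
    print(f"{technology}: {average_score(ratings):.2f}")
```

On this scale, InfiniBand averages about 2.83, iSCSI 1.5 and the multiprotocol intelligent switches 1.4, which matches the summary above.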