Published: 13 Oct 2003
Last month, we discussed transport issues for remote disaster sites (see "Linking SANs for disaster recovery, Part 1"). Once you have decided on the transport's physical infrastructure, your next major decision is which protocol will carry data to the other side.
If your requirements dictate and your budget allows for dense wavelength division multiplexing (DWDM), then you probably want to maintain native Fibre Channel (FC) communication between the initiator and target storage area network (SAN) device. If you've chosen TCP as your WAN transport mechanism, you have three choices for protocols:
- Fibre Channel over IP (FCIP)
- Internet Fibre Channel Protocol (iFCP)
- Internet SCSI (iSCSI)
|How iFCP works|
|iFCP gateways can connect multiple SANs to multiple SANs through an IP cloud. Each sending SAN switch sends packets (1) to the iFCP gateway, which encapsulates them (2) and tracks both the IP and FC addresses of all the attached nodes. The receiving gateway strips off the IP packet (3) and sends the FC payload to the appropriate SAN (4) on the receiving side.|
|How FCIP works|
|FCIP creates a direct connection between SANs. The FC packet (1) is sent to the FCIP bridge, where it is encapsulated within an IP packet (2) and sent over an IP link to the receiving bridge, which strips off the IP header (3) and sends it to the receiving FC switch (4).|
FCIP: Tunneling through the cloud
FCIP is a tunneling protocol designed to provide an IP transport between SAN islands. When a local FC port needs to talk to a remote FC port, the FC frame is sent to an FCIP device, which encapsulates the frame in TCP/IP and sends it across the transport to the IP port of the FCIP device at the other end (see "How FCIP works"). The receiving FCIP device strips the TCP/IP headers and delivers the frame to the destination FC port. Neither network (FC or IP) is aware that it is using or servicing the other; the FCIP device is solely responsible for the translation.
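To make the tunneling idea concrete, here is a minimal Python sketch of encapsulation and decapsulation. It is an illustration only: the 4-byte length prefix is a hypothetical stand-in for the real FCIP encapsulation header, and the function names `encapsulate` and `decapsulate` are invented for this example. The point is simply that the FC frame travels as an opaque payload and comes out the other side unchanged.

```python
import struct

# Hypothetical 4-byte big-endian length prefix; NOT the real FCIP header layout.
HEADER = struct.Struct("!I")

def encapsulate(fc_frame: bytes) -> bytes:
    """Wrap an opaque FC frame so it can be written to a TCP byte stream."""
    return HEADER.pack(len(fc_frame)) + fc_frame

def decapsulate(stream: bytes) -> list:
    """Split a received byte stream back into the original FC frames."""
    frames = []
    offset = 0
    while offset < len(stream):
        (length,) = HEADER.unpack_from(stream, offset)
        offset += HEADER.size
        frames.append(stream[offset:offset + length])
        offset += length
    return frames

# The sending bridge wraps each frame; the receiving bridge strips the wrapper.
sent = [b"FC-frame-1", b"FC-frame-2"]
wire = b"".join(encapsulate(frame) for frame in sent)
received = decapsulate(wire)
```

Neither end of this toy tunnel inspects the payload, which mirrors the way the FC and IP networks stay unaware of each other.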
Although the user datagram protocol (UDP) can be used to transport packets across an IP link, TCP is the most widely used transport because in-order delivery, data integrity and congestion control are desirable on the WAN. Because TCP retransmits lost or corrupted segments, the line quality of the WAN link is critical to keeping retransmissions to a minimum.
The size of the WAN pipe ranges from OC-3 (155Mb/s) to OC-192 (10Gb/s). As you collect data in the application assessment phase, consider your applications' burst rates and burst times, along with over-subscription calculations, to arrive at the bandwidth you need between sites. You should also provision additional capacity for time-sensitive applications to minimize delays during such operations as synchronous disk activity.
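As a rough illustration of that sizing exercise, the sketch below adds up per-application burst rates, discounts them by an over-subscription ratio and then adds headroom for time-sensitive traffic. The function name, the parameters and all of the input figures are assumptions made up for this example, not measurements or a vendor formula; substitute the numbers from your own assessment.

```python
OC3_MBPS = 155  # nominal OC-3 line rate quoted above; usable payload rate is lower

def required_bandwidth_mbps(peak_bursts_mbps, oversubscription_ratio, headroom):
    """Estimate inter-site bandwidth from per-application burst rates.

    oversubscription_ratio: assumed discount for bursts that do not overlap
        (2.0 means at most half of the summed bursts occur at once).
    headroom: extra fraction reserved for time-sensitive traffic.
    """
    aggregate = sum(peak_bursts_mbps)
    return aggregate / oversubscription_ratio * (1 + headroom)

# Illustrative numbers only.
bursts = [80, 40, 30]  # Mb/s per application
need = required_bandwidth_mbps(bursts, oversubscription_ratio=2.0, headroom=0.25)
fits_oc3 = need <= OC3_MBPS
print(f"{need:.2f} Mb/s needed; fits OC-3: {fits_oc3}")
```

With these made-up inputs the estimate is 93.75 Mb/s, which fits within an OC-3. A different over-subscription assumption can easily push the result into the next tier.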
Because the FC frame isn't traversing the network under its native header as it would with DWDM, the FC frame and the IP transport are each subject to the other's timeout values. Additionally, when there isn't any application data flowing over the wire, the IP ports on the FCIP devices will send keepalive packets to each other at a configurable rate. This improves the monitoring function of the link and helps to maintain the connections between the FCIP devices.
The E_Ports on the FC switches linked by FCIP are unaware that there's an IP tunnel between them. They exchange Class F traffic as if they were directly connected to one another. For all intents and purposes, they don't care, as long as their frames are delivered and their acknowledgements are received before their timers expire.
If the IP tunnel is lost due to a hardware or carrier failure, the fabric will segment just as it would if directly connected E_Ports lost their link. However, when the IP tunnel is recovered, at least one of the switches joining the fabric across the long-distance SAN link will need to be reinitialized. Check with your FCIP switch vendor for a fix.
Flow control is implemented by both FC and TCP, one on each side of the FCIP device. If you're using Class 3 as your FC transport mechanism, flow control relies on buffer credits and receive ready (R_RDY) signals: the sending port regains a buffer credit each time the recipient returns an R_RDY after emptying its buffer of the previous frame. When the IP side of the FCIP device needs to send packets over the IP transport, however, its sending is governed by the sliding window advertised by its partner on the remote FCIP device.
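The credit mechanism on the FC side can be pictured as a simple counter. The following is a toy model, not switch firmware: the class name `CreditedPort` and its methods are hypothetical, and the starting credit count is an arbitrary value chosen for the example.

```python
class CreditedPort:
    """Toy model of a sending port governed by buffer credits."""

    def __init__(self, credits: int):
        self.credits = credits  # frames the sender may transmit right now

    def can_send(self) -> bool:
        return self.credits > 0

    def send_frame(self) -> None:
        # Each transmitted frame consumes one credit.
        if not self.can_send():
            raise RuntimeError("no buffer credits left; wait for R_RDY")
        self.credits -= 1

    def receive_r_rdy(self) -> None:
        # The recipient freed a buffer, so one credit is returned.
        self.credits += 1

port = CreditedPort(credits=2)
port.send_frame()
port.send_frame()
stalled = not port.can_send()   # sender must now wait
port.receive_r_rdy()
resumed = port.can_send()       # one credit is back
```

On a long link, credits take longer to return, so a sender with too few credits spends more of its time stalled. That is one reason distance matters when you size the FC side of an extended SAN.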
Management of the encapsulated FC frames that traverse IP transport will need to be performed with IP network management tools. These tools won't be able to peek inside to determine the characteristics of the FC frame; therefore, you will need a combination of the FC and IP network management tools to get a complete picture of your extended SAN. The same isn't true for extended SAN solutions completely engineered around IP (i.e., iFCP and iSCSI).
iFCP: Improved error isolation
iFCP is a peer-to-peer gateway protocol that provides connection services to FC devices over an IP transport, using TCP for the same reasons mentioned earlier. SAN islands or individual devices are bridged into the IP network using an iFCP gateway device (see "How iFCP works"). Once bridged, FC devices are assigned an IP address by the gateway for communication across the TCP transport. After the address is assigned, the FC device has an ARP-like entry in the iFCP device that refers to both its native FC address and its foreign IP address.
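The ARP-like table can be sketched as a pair of dictionaries. This is a minimal illustration under stated assumptions: the class `GatewayAddressTable`, the address pool and the sample FC address are all invented for the example, and a real gateway's assignment policy is vendor-specific.

```python
import ipaddress

class GatewayAddressTable:
    """Toy mapping between FC addresses and gateway-assigned IP addresses."""

    def __init__(self, pool_cidr: str):
        # Hypothetical pool from which the gateway hands out addresses.
        self._free = iter(ipaddress.ip_network(pool_cidr).hosts())
        self._fc_to_ip = {}
        self._ip_to_fc = {}

    def register(self, fc_address: int):
        """Assign an IP address to an FC device, or return the existing one."""
        if fc_address not in self._fc_to_ip:
            ip = next(self._free)
            self._fc_to_ip[fc_address] = ip
            self._ip_to_fc[ip] = fc_address
        return self._fc_to_ip[fc_address]

    def fc_for(self, ip):
        """Reverse lookup used when traffic arrives from the IP side."""
        return self._ip_to_fc[ip]

table = GatewayAddressTable("10.0.0.0/29")
ip = table.register(0x010200)        # made-up 24-bit FC address
same_ip = table.register(0x010200)   # registering again reuses the entry
fc = table.fc_for(ip)
```

Keeping both directions of the mapping lets the gateway translate outbound frames and inbound packets with a single lookup each.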
iFCP vendors are able to replace FC switches because their iFCP devices intercept communication coming from locally attached FC devices and either pass the frames to a locally attached FC destination or well-known services, or establish a TCP/IP connection to a remote iFCP gateway and then ship the communicated frame across the IP transport.
Central to the initial discovery and subsequent communication between devices in an IP SAN is the Internet storage name server (iSNS). The iSNS is similar in functionality to the name server in FC networks: iFCP or iSCSI devices must register their characteristics with an iSNS server, which resides either in an IP storage device or on a centralized server. The device is then added to a discovery domain (zone) and can be accessed only by the devices in that domain. As with FC SANs, state change notifications are sent to each registered device in the discovery domain when changes are made within the domain.
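The registration and notification behavior described above can be modeled in a few lines. This sketch is not an iSNS client or server: the class `NameServer`, its methods and the device names are hypothetical, and it captures only the idea that visibility and notifications are scoped to a discovery domain.

```python
from collections import defaultdict

class NameServer:
    """Toy name server with discovery domains and change notifications."""

    def __init__(self):
        self.domains = defaultdict(set)         # domain name -> member devices
        self.notifications = defaultdict(list)  # device -> messages received

    def register(self, device: str, domain: str) -> None:
        # Existing members of the domain are told about the change.
        for member in self.domains[domain]:
            self.notifications[member].append(f"{device} joined {domain}")
        self.domains[domain].add(device)

    def visible_to(self, device: str) -> set:
        """Return the peers that share at least one domain with the device."""
        peers = set()
        for members in self.domains.values():
            if device in members:
                peers |= members
        peers.discard(device)
        return peers

ns = NameServer()
ns.register("host-a", "payroll")
ns.register("array-1", "payroll")
ns.register("host-b", "email")
```

Here `host-a` can discover `array-1` and is notified when it joins, while `host-b` in the other domain sees neither device and receives nothing.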
iFCP has two strengths that aren't evident in FCIP:
- FC protocol errors are contained within the gateway region managed by the iFCP gateway
- TCP connections and flow control are managed at the device level
Even so, with proper zoning, state change notifications (SCNs) will be confined to their respective zone members. As for flow control, because FC devices are assigned IP addresses in iFCP, congestion is managed separately between the communicating pairs. With FCIP, congestion is managed only between the FCIP devices extending the fabric, not between the individual devices. However, this is likely to be a problem only if the bandwidth specifications for the FC and IP sides of the FCIP device don't match.
iSCSI: Management challenges
Of the long-distance SAN link solutions, iSCSI products have been the slowest to come to market. Perhaps this is because the protocol itself is a complete reworking of how servers access their storage. Both FCIP and iFCP borrow from the FC specification for communicating between end nodes. iSCSI, on the other hand, implements its own protocol stack for the transmission of block data.
As a result, iSCSI vendors had the opportunity to build all the requisite intelligence into the iSCSI interface, and they took it, which enables the use of generic IP platforms to transport data over the SAN. This is in contrast to FC SANs, where much of the intelligence is in the switching infrastructure. Here, I see a potential for increased administrative burden unless tools are developed to interrogate iSCSI interfaces across the SAN for possible problem resolution. Each device on an iSCSI SAN is outfitted with an iSCSI interface connecting it to the SAN. Unless the SAN administrator can reach the device driver logs on each node from a central point, resolving problems will require access to each node individually.
Addressing in iSCSI is based on a hierarchy of network identities. At the highest level, there is a network entity identified by an IP address and port number. Underneath the network entity is the iSCSI node name, which is individually managed and can be up to 255 bytes long. For example, the network entity may be the controller on a storage array, and the iSCSI node names may identify the individual disks that sit behind the controller.
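The two-level hierarchy maps naturally onto a small data structure. The sketch below is illustrative only: the class `NetworkEntity`, the IP address and the node names are made up, port 3260 is assumed as the iSCSI port, and the length check simply applies the 255-byte figure quoted above, which you should verify against the current iSCSI specification.

```python
from dataclasses import dataclass, field

MAX_NODE_NAME_BYTES = 255  # figure quoted in the text; verify against the spec

@dataclass
class NetworkEntity:
    """A network entity (IP address and port) with the node names behind it."""
    ip_address: str
    port: int
    node_names: list = field(default_factory=list)

    def add_node(self, name: str) -> None:
        # The limit is expressed in bytes, so measure the encoded length.
        if len(name.encode("utf-8")) > MAX_NODE_NAME_BYTES:
            raise ValueError("iSCSI node name too long")
        self.node_names.append(name)

# Hypothetical array controller with two disks behind it.
controller = NetworkEntity("192.0.2.10", 3260)
controller.add_node("iqn.2003-10.com.example:array1.disk0")
controller.add_node("iqn.2003-10.com.example:array1.disk1")
```

Because the node names are managed independently of the IP address and port, a disk keeps its name even if the controller in front of it is readdressed.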
As mentioned earlier, iSCSI devices register with the iSNS server, and initiators discover their targets within the domain. Communication paths are created between the initiator and target through a login process similar to FC's. Operating parameters are exchanged and agreed on before a login is accepted.
One challenge with iSCSI is that SCSI is intolerant of errors on the transport. Put that together with the susceptibility of IP networks to packet loss and you have a problem. With this in mind, implementing an iSCSI long-distance SAN link requires the highest quality of IP connectivity between sites.
Another challenge is the CPU overhead on iSCSI nodes while processing packets. To address this issue, manufacturers of iSCSI interface cards have added TCP offload engines to handle much of the interrupt processing associated with sending packets on the IP network. Still, you should test the raw performance of a potential vendor's iSCSI interface card before moving off an existing technology or selecting a new one.
The decision tree
Clearly, many factors go into deciding among the three IP storage options. Ultimately, it comes down to this:
- FCIP is simpler than iFCP in terms of the number of devices as well as in the protocol itself. For a SAN with a core-edge design that needs to mirror data to a like SAN with relatively few devices, FCIP is the quickest, most economical solution available.
- If your data center is populated with a number of SAN islands that support different applications, then perhaps you want to leverage the error and congestion isolation benefits of iFCP.
- On the other hand, if you don't have a SAN yet and have the luxury of waiting to see how iSCSI continues to fare in the marketplace before making FC investments, then iSCSI may be your best opportunity to answer corporate mandates for business continuity and, at the same time, avoid the burden of managing multiple protocols over your long-distance SAN transport.