Last issue we investigated the three different topologies available to the SAN designer: point-to-point, arbitrated loop and fabric. Fabrics are the key to most contemporary installations, and inter-switch links (ISLs) are crucial to scaling a fabric as your SAN grows.
|Multiple ISLs increase fabric performance|
With two 2Gb/s ISLs between two switches, aggregate bandwidth is increased to 4Gb/s.
Connecting those switches will require ISLs, which are virtual channels that carry command, data and status information between the initiator and target. They also carry inter-switch (Class F) traffic between neighboring switches. There are eight virtual channels (0-7) per ISL. Each virtual channel is a physical circuit between the E_Ports (expansion ports) of connected switches, and each virtual channel can be configured to accept a certain class of traffic (see "Best Practices," May 2002) at one of four priority levels (see "A channel guide to ISLs," this page).
The virtual channel approach automatically partitions the ISLs' available bandwidth. Each frame destined for the ISL can potentially traverse any available circuit that supports the class of traffic identified in the frame header. This partitioning and prioritization of the ISL yields optimum performance and more predictable results when scaling the fabric. Note that the physical circuit supporting each virtual channel manages its buffer credit algorithm in much the same way that buffers are managed in an N_Port and E_Port conversation.
Scaling a single switch to even a two-switch configuration should be approached from the standpoint of availability and performance, with regard to the application's use of the ISL configuration. For all intents and purposes, the ISLs that exist between switches are an extension of the data bus. Before SANs, you didn't need to think much about design when connecting a SCSI initiator to its target. But in today's SANs, these are precisely the areas you need to concentrate on to ensure an appropriate SAN design.
From an availability perspective, when scaling a SAN beyond a single switch, the minimum number of ISLs is two. When switches are joined via ISLs, the supporting fabrics converge and become one - if the values of their fabric operating system variables are equal. From that point on, not only will the two switches communicate with one another using Class F traffic over the ISL, but initiators and targets that are not connected to the same switch will also communicate with each other over the ISL.
If this design were supported by a single ISL and the GBIC supporting the E_Port failed, the one fabric composed of two switches would be reduced to two fabrics composed of one switch each, resulting in a split-brain effect. At that point, the two switches would be unable to communicate, and so would the initiators and targets that are not on the same switch. With two or more ISLs, the supported applications benefit from a higher level of availability and performance.
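The split described above can be sketched as a simple connectivity check. This is an illustrative model only, not switch firmware: the names `fabrics_after_failure`, `sw1` and `sw2` are hypothetical, and each ISL is represented as a pair of switch names, so two ISLs between the same switches appear as two identical pairs.

```python
from collections import defaultdict, deque


def fabrics_after_failure(switches, isls, failed_index):
    """Return the list of fabrics (sets of switches) left after one ISL fails.

    switches     -- iterable of switch names
    isls         -- list of (switch_a, switch_b) pairs, one pair per ISL
    failed_index -- position in `isls` of the ISL that has failed
    """
    # Drop the failed ISL and keep every other link.
    surviving = [link for i, link in enumerate(isls) if i != failed_index]

    # Build an adjacency map from the surviving ISLs.
    neighbors = defaultdict(set)
    for a, b in surviving:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # Breadth-first search: every connected group of switches is one fabric.
    seen = set()
    result = []
    for start in switches:
        if start in seen:
            continue
        fabric = set()
        queue = deque([start])
        seen.add(start)
        while queue:
            current = queue.popleft()
            fabric.add(current)
            for nxt in neighbors[current]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        result.append(fabric)
    return result


switches = ["sw1", "sw2"]
single_isl = [("sw1", "sw2")]
dual_isl = [("sw1", "sw2"), ("sw1", "sw2")]

print(len(fabrics_after_failure(switches, single_isl, 0)))  # one ISL, it fails
print(len(fabrics_after_failure(switches, dual_isl, 0)))    # two ISLs, one fails
```

With a single ISL the model reports two separate fabrics after the failure; with two ISLs the switches remain in one fabric, which is the availability argument for a minimum of two ISLs.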
Depending on your hardware, each ISL will operate at either 1Gb/s or 2Gb/s. Simply stated, the greater the number of ISLs between neighboring switches, the greater the available bandwidth (see "Multiple ISLs increase fabric performance," this page). However, this also increases the cost of ownership, because you are sacrificing ports that would otherwise be used to connect initiators or targets.
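The trade-off between bandwidth and ports is simple arithmetic, sketched here for illustration. The function name `isl_tradeoff` and the 16-port switch size are assumptions chosen for the example, not figures from any particular product.

```python
def isl_tradeoff(num_isls, isl_speed_gbps, ports_per_switch):
    """Return (aggregate ISL bandwidth in Gb/s, ports left for devices).

    Every ISL consumes one port on each of the two connected switches,
    so the ports left for initiators and targets shrink as ISLs are added.
    """
    if num_isls < 0 or num_isls > ports_per_switch:
        raise ValueError("number of ISLs must fit within the switch port count")
    aggregate_gbps = num_isls * isl_speed_gbps
    device_ports = ports_per_switch - num_isls
    return aggregate_gbps, device_ports


# Hypothetical 16-port switches joined by 2Gb/s ISLs.
for isl_count in (1, 2, 4):
    bandwidth, ports_left = isl_tradeoff(isl_count, 2, 16)
    print(f"{isl_count} ISL(s): {bandwidth}Gb/s aggregate, "
          f"{ports_left} device ports left per switch")
```

Under these assumptions, two ISLs give 4Gb/s of aggregate bandwidth at the cost of two ports on each switch.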
Before determining how many ISLs the application requires between neighboring switches, it is necessary to review the traffic patterns of the targeted application with regard to the design of the entire SAN. For example, if during peak periods the targeted application would consume the entire bandwidth of the configured ISLs and no other applications will be utilizing the ISLs during this time, then this configuration may very well support the application. However, if multiple applications are sharing the ISLs, you must account for the traffic patterns of the second, third or fourth application in your design as well.
This is not to say that each application should or must be afforded its own ISLs - a group of ISLs could be configured to support communications between a number of initiators (applications) and their targets (disks or tape). Nor is it absolutely necessary for the aggregate bandwidth of the ISLs to meet or exceed the peak utilization of the combined applications if each application doesn't experience its peak utilization at the same time. If the targeted applications don't experience peak utilization at the same time, but their aggregate bandwidth needs exceed the capabilities of the ISLs, the ISLs are said to be over-subscribed.
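To make the distinction concrete, here is a minimal sketch that compares the sum of each application's individual peak with the worst concurrent load. The application names, the three time slots and the bandwidth figures are invented for illustration; real profiles would come from your own traffic measurements.

```python
def assess_isls(profiles, isl_capacity_gbps):
    """Compare application demand against ISL capacity.

    profiles          -- dict mapping application name to a list of demand
                         samples in Gb/s, one per time slot (same length each)
    isl_capacity_gbps -- aggregate bandwidth of the ISLs being shared
    """
    # Sum of each application's own peak, regardless of when it occurs.
    sum_of_peaks = sum(max(samples) for samples in profiles.values())

    # Highest combined demand actually seen in any single time slot.
    concurrent_peak = max(sum(slot) for slot in zip(*profiles.values()))

    return {
        "sum_of_peaks": sum_of_peaks,
        "concurrent_peak": concurrent_peak,
        # Over-subscribed in the article's sense: combined peaks exceed capacity.
        "over_subscribed": sum_of_peaks > isl_capacity_gbps,
        # Saturated: demand in some time slot really does exceed capacity.
        "saturated": concurrent_peak > isl_capacity_gbps,
    }


# Hypothetical demand in Gb/s for morning, midday and night.
profiles = {
    "oltp": [3.0, 1.0, 0.5],
    "backup": [0.5, 1.0, 3.0],
}
print(assess_isls(profiles, isl_capacity_gbps=4.0))
```

In this example the ISLs are over-subscribed (the peaks total 6Gb/s against 4Gb/s of capacity) but never saturated, because the two applications peak at different times and the busiest slot needs only 3.5Gb/s.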
|Making a mesh of it|
The initiators connected to switch-1 have a direct path to the targets on switch-4 via an ISL connection to switch-4, and an indirect path via the ISLs between switch-2, switch-3 and switch-4. However, according to the FSPF protocol, only the route between switch-1 and switch-4 will be installed in switch-1's routing table. Should the ISL between switch-1 and switch-4 fail, a hold-down period of 650ms is enacted before the route with the smallest link cost is removed and the second-shortest route is installed in switch-1's routing table.
Connected switches utilize the Fabric Shortest Path First (FSPF) protocol to determine the optimum path between the initiator and its target after the initiator has requested delivery from the fabric. FSPF was authored by Brocade Communications and is considered the de facto standard by many standards bodies. Understanding its internal operations isn't much of a leap from the inner workings of the Open Shortest Path First (OSPF) protocol used by Internet routers.
Neither the initiator nor the client has any knowledge of FSPF. It is contained entirely in the fabric and used by the switches that make up the fabric. Depending on the ISL layout of your SAN solution, FSPF operations will affect your application's availability and performance metrics.
With FSPF, only the shortest paths from initiator to target will be installed in the FC switch's routing table. For example, if you have four switches connected in a mesh configuration, each path or route is given a weight or cost depending on the speed of the link and the distance (number of hops) between the initiator and target (see "Making a mesh of it"). So, the bandwidth between switches for any given initiator/target connection is not the aggregate, but the best single link as determined by FSPF.
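The route selection in the four-switch example can be approximated with a standard shortest-path search. The sketch below uses Dijkstra's algorithm as a stand-in and is not an implementation of FSPF itself: the link costs (1000 for a 1Gb/s link, 500 for a 2Gb/s link) are illustrative assumptions, the switch names and the `shortest_route` function are hypothetical, and the 650ms hold-down timer is not modeled.

```python
import heapq

# Assumed, illustrative link costs: faster links are cheaper.
LINK_COST = {1: 1000, 2: 500}  # keyed by link speed in Gb/s


def shortest_route(links, source, target):
    """Return (total_cost, path) for the cheapest route, or None if unreachable.

    links -- list of (switch_a, switch_b, speed_gbps) tuples, one per ISL
    """
    # Build a bidirectional graph weighted by the assumed link cost.
    graph = {}
    for a, b, speed in links:
        cost = LINK_COST[speed]
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    # Dijkstra's algorithm using a priority queue of (cost, switch, path).
    heap = [(0, source, [source])]
    visited = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, link_cost in graph.get(node, []):
            if nxt not in visited:
                heapq.heappush(heap, (cost + link_cost, nxt, path + [nxt]))
    return None


# The links described in "Making a mesh of it", all assumed to run at 2Gb/s.
mesh = [
    ("sw1", "sw4", 2),
    ("sw1", "sw2", 2),
    ("sw2", "sw3", 2),
    ("sw3", "sw4", 2),
]
print(shortest_route(mesh, "sw1", "sw4"))

# Simulate losing the direct ISL between switch-1 and switch-4.
degraded = [link for link in mesh if link[:2] != ("sw1", "sw4")]
print(shortest_route(degraded, "sw1", "sw4"))
```

With all links healthy, only the one-hop route from switch-1 to switch-4 is selected. Once the direct ISL is removed, the three-hop route through switch-2 and switch-3 becomes the cheapest remaining path, at three times the cost under the assumed values.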
As you can see, connecting FC switches is not as simple as plugging in a fiber optic cable. Much thought should be invested in the affected application's traffic patterns before deciding on a standard ISL design for your enterprise SAN. However, your mental investment and disciplined approach to ISL design will prove worthwhile when you are able to scale your performance with the number of ports effortlessly.