
Troubleshooting complex SAN environments

As storage requirements increase and companies find themselves supporting multiple "SAN islands" of data, troubleshooting connection issues becomes more complex.

Much of the SAN infrastructure in place these days is a hodgepodge of components from different vendors, arranged in complex topologies that arise when companies implement new technology alongside existing solutions. Your backup solution may use hubs and be based on Fibre Channel Arbitrated Loop (FC-AL), while your disk subsystems use switches and might be based on Fibre Channel Switch Fabric (FC-SW). Connecting different generations of equipment, which often speak different protocols, can prove difficult at best.

Here are a few tips that may make troubleshooting these complex SAN environments a bit easier.

The number one issue I have seen by far is vendor incompatibility. Since standards from SNIA (the Storage Networking Industry Association) and the FCIA (the Fibre Channel Industry Association) are just now coming into being, most switch vendors implement their own version of a "name server" in their firmware.

This can make interswitch links a problem when connecting switches from two different vendors. The moral of the story is to stay consistent when choosing a vendor for your infrastructure. Even market share leaders may have issues when connecting switches running older firmware with newer ones as standards move forward.

Try to choose a switch vendor whose firmware can be upgraded as new standards become available.

When troubleshooting complex SAN environments, always start at the switch. When you telnet into a Brocade switch, the "switchShow" command lists the World Wide Names the switch sees on each port, and whether those ports are logged in as "F-Ports" (Fabric) or "FL-Ports" (Loop). An incorrect HBA driver on a host can cause it to log in to the switch with the wrong port type, with no visible indication that anything is wrong.
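Once the command output has been captured to a text file, the port listing can be summarized with a short script. The following is a minimal Python sketch, not a vendor tool. The sample text is hypothetical and only imitates the general shape of a port listing; the real layout varies with switch model and firmware version, so the patterns here are assumptions that would need adjusting against actual output. The name `parse_switchshow` is made up for this illustration.

```python
import re

# A World Wide Name: eight colon-separated hex bytes.
WWN_RE = re.compile(r"\b(?:[0-9a-fA-F]{2}:){7}[0-9a-fA-F]{2}\b")
# Port-type tokens to look for (assumed spellings; adjust to your firmware).
PORT_TYPE_RE = re.compile(r"\b(FL-Port|F-Port|L-Port|E-Port)\b")
# Assumed line prefix such as "port  3:".
PORT_NUM_RE = re.compile(r"^\s*port\s+(\d+)\s*:", re.IGNORECASE)


def parse_switchshow(text):
    """Map port number -> {'type': ..., 'wwn': ...} from captured output."""
    ports = {}
    for line in text.splitlines():
        port_match = PORT_NUM_RE.match(line)
        if not port_match:
            continue  # skip headers and anything that is not a port line
        type_match = PORT_TYPE_RE.search(line)
        wwn_match = WWN_RE.search(line)
        ports[int(port_match.group(1))] = {
            "type": type_match.group(1) if type_match else None,
            "wwn": wwn_match.group(0).lower() if wwn_match else None,
        }
    return ports


# Hypothetical sample, NOT captured from a real switch.
SAMPLE = """\
port  0: id Online  F-Port 10:00:00:00:c9:2b:4f:a1
port  1: id Online  FL-Port 21:00:00:e0:8b:05:05:04
port  2: -- No_Module
"""

if __name__ == "__main__":
    for number, info in sorted(parse_switchshow(SAMPLE).items()):
        print(number, info["type"], info["wwn"])
```

A port with no device logged in simply comes back with `None` for both fields, which makes empty or failed ports easy to spot in the summary.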

Most switches support port autosensing and can speak either FC-SW or FC-AL. When a device logs into the switch successfully, the port light turns green. The problem is that a green light does not tell you which mode was negotiated: your storage may be talking Fabric while the host has logged in as a Loop port. Although Fabric devices can communicate with Loop devices, the reverse is not always true (unless the vendor provides a mapping protocol).

The way to check this is to log into the switch and determine whether the port is in Loop or Fabric mode. If the host driver or HBA is at fault, fixing the problem and rebooting may not solve the issue. If this is the case, try removing and re-inserting the GBIC (Gigabit Interface Converter) in the switch. This resets the port and eliminates any affinity for port type.
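The Loop-versus-Fabric check described above can also be done in bulk by comparing the login type you expect on each port with what the switch reports. This is a minimal Python sketch under stated assumptions: both mappings are hypothetical, hand-built dictionaries of port number to port type, and `find_login_mismatches` is an illustrative helper, not part of any vendor tool.

```python
def find_login_mismatches(expected, observed):
    """Return (port, expected_type, observed_type) for every port whose
    reported login type differs from the one you planned for."""
    mismatches = []
    for port, want in sorted(expected.items()):
        got = observed.get(port)  # None if the switch shows no login
        if got != want:
            mismatches.append((port, want, got))
    return mismatches


# Hypothetical data: what you intended versus what the switch reports.
expected = {0: "F-Port", 1: "F-Port"}
observed = {0: "F-Port", 1: "FL-Port"}

if __name__ == "__main__":
    for port, want, got in find_login_mismatches(expected, observed):
        print(f"port {port}: expected {want}, switch reports {got}")
```

Any port flagged this way is a candidate for the driver fix and GBIC reseat described above.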

There are many other simple tips that can make supporting complex SANs easier. Your vendor's service personnel are a good resource; pick their brains for the techniques they have developed.

About the author: Christopher L. Poelker is a systems engineer and storage architect at Hitachi Data Systems. He holds MCSE, MCT, MASE, and A+ certifications and spent 21 years at Digital and Compaq, including 18 years in services and 3 years as a SAN architect.
