Multipath I/O: Guarantee redundancy during HBA failure (almost!)
By Rick Cook
What you will learn from this tip: Microsoft's Multipath I/O (MPIO) is designed to mitigate the effects of an HBA failure by providing an alternate data path between devices. Learn why MPIO doesn't completely eliminate the possibility of problems if an HBA fails and what you can do to ensure redundancy.
Technically, MPIO represents a hybrid approach to multipathing, combining software specific to each host bus adapter (HBA) -- the Device Specific Module or DSM -- with the MPIO features integrated into the Windows operating system. While the DSMs must be written for specific hardware and cannot be generic, Microsoft designed MPIO to be as HBA-agnostic as possible. As a result, MPIO depends on the rest of the hardware and software working correctly in order to do its job.
In general, this works very well, and MPIO provides an important element of redundancy, as well as load balancing, in Windows storage environments. However, there are some things you need to be aware of when it comes to MPIO and HBA failure.
The most obvious is that you have to have at least two HBAs connected to the storage device. A dual-port HBA may provide two data paths, but it still represents a single point of failure.
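The distinction between "two paths" and "two HBAs" can be checked mechanically. The following is a minimal sketch in Python, assuming you have already collected a path inventory by some other means; the `Path` record, its field names and the `single_points_of_failure` helper are hypothetical and are not part of MPIO or any vendor's DSM.

```python
from typing import Dict, Iterable, NamedTuple, Set


class Path(NamedTuple):
    hba: str     # hypothetical adapter identifier, e.g. "hba0"
    port: int    # port number on that adapter
    target: str  # storage device this path reaches


def single_points_of_failure(paths: Iterable[Path]) -> Dict[str, Set[str]]:
    """Return the targets that are reachable through fewer than two distinct HBAs.

    The result maps each such target to the set of HBAs that reach it.
    Ports are deliberately ignored: two ports on one adapter still count
    as one HBA.
    """
    hbas_by_target: Dict[str, Set[str]] = {}
    for p in paths:
        hbas_by_target.setdefault(p.target, set()).add(p.hba)
    return {t: hbas for t, hbas in hbas_by_target.items() if len(hbas) < 2}


# A dual-port HBA gives two paths but only one adapter.
dual_port = [Path("hba0", 0, "array1"), Path("hba0", 1, "array1")]
# Two separate HBAs give two paths and two adapters.
two_hbas = [Path("hba0", 0, "array1"), Path("hba1", 0, "array1")]

dual_port_result = single_points_of_failure(dual_port)
two_hbas_result = single_points_of_failure(two_hbas)
```

In this sketch the dual-port layout is flagged because both paths share `hba0`, while the two-HBA layout is not.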
In a storage area network (SAN), you need to be sure that any switches or routers lying in the data path have enough path management intelligence to handle disruptions caused by an HBA failure.
In theory, MPIO will fail over in the event of a data path interruption and then fail back in a matter of seconds when the path is restored. In practice, it's not always that clean. In cluster environments, you need to be sure that the cluster can fail back to restore the data path as well as fail over in the event of a problem. Failback is a separate operation from failover and the two aren't necessarily symmetrical. A system that doesn't fail back (or worse, doesn't fail over in the first place) may have a misconfigured HBA, switch or other component, or it may have a problem with the HBA itself. Contact your hardware vendor for more information. (Microsoft discusses what happens when an HBA is unplugged and plugged back in, simulating failover and failback on a cluster.)
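To see why failover and failback are separate operations, consider a toy model. This is only an illustrative sketch in Python, not how MPIO or any DSM is implemented; the `PathGroup` class, its method names and the `failback_enabled` flag are all assumptions made for the example.

```python
class PathGroup:
    """Toy model of one preferred path plus alternates (hypothetical)."""

    def __init__(self, preferred, alternates):
        self.preferred = preferred
        self.order = [preferred, *alternates]  # priority order for failover
        self.healthy = set(self.order)
        self.active = preferred

    def path_failed(self, path):
        """Failover: if the active path dies, move to the next healthy one."""
        self.healthy.discard(path)
        if path == self.active:
            self.active = next(
                (p for p in self.order if p in self.healthy), None
            )

    def path_restored(self, path, failback_enabled=True):
        """Failback: a restored path becomes active again only if allowed."""
        self.healthy.add(path)
        if self.active is None:
            # Nothing was carrying I/O, so any restored path is used.
            self.active = path
        elif failback_enabled and path == self.preferred:
            self.active = path


group = PathGroup("hba0", ["hba1"])

group.path_failed("hba0")
after_failover = group.active

# The path comes back, but failback is not configured.
group.path_restored("hba0", failback_enabled=False)
after_restore_without_failback = group.active

# The same event with failback configured.
group.path_restored("hba0", failback_enabled=True)
after_failback = group.active
```

In the model, a system can fail over correctly and still stay on the alternate path after the original is restored, which is the asymmetry worth testing for on real hardware.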
If you're still running Windows 2000, there's another potential problem. Multipath software may cause the disk signature to change if there is a failure. This can cause the system to fail because it can no longer find the disk.
In general, if you have problems related to multipathing, you're not going to be dealing with Microsoft. The HBA vendors write the DSMs, and Microsoft's attitude is that because the DSM and the other details are implemented by the hardware and related software vendors, those vendors are the place to go for help in troubleshooting.
About the author: Rick Cook specializes in writing about issues related to storage and
07 Nov 2006