Multipath I/O: Guarantee redundancy during HBA failure (almost!)

Microsoft's Multipath I/O (MPIO) is designed to mitigate the effects of an HBA failure by providing an alternate data path between devices. Learn why MPIO doesn't completely eliminate the possibility of problems if an HBA fails and what you can do to ensure redundancy.


Technically, MPIO represents a hybrid approach to multipathing, combining software specific to each host bus adapter (HBA) -- the Device Specific Module or DSM -- with the MPIO features integrated into the Windows operating system. While the DSMs must be written for specific hardware and cannot be generic, Microsoft designed MPIO to be as HBA agnostic as possible. As such, it depends on the rest of the hardware and software to work correctly in order to do its job.

In general, this works very well, and MPIO provides an important element of redundancy, as well as load balancing, in Windows storage environments. However, there are some things you need to be aware of when it comes to MPIO and HBA failure.

The most obvious is that you have to have at least two HBAs connected to the storage device. A dual-port HBA may provide two data paths, but it still represents a single point of failure.
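As a rough illustration of that point, the check below counts distinct HBAs rather than distinct paths. The `Path` record and the names `hba0` and `hba1` are hypothetical, made up for this sketch; a real inventory would come from your HBA vendor's tools.

```python
from collections import namedtuple

# Hypothetical record for one data path: the HBA it uses and the port on that HBA.
Path = namedtuple("Path", ["hba", "port"])


def hba_is_single_point_of_failure(paths):
    """Return True if fewer than two distinct HBAs carry the paths."""
    return len({p.hba for p in paths}) < 2


# Two paths through one dual-port HBA: two paths, but one point of failure.
dual_port = [Path("hba0", 0), Path("hba0", 1)]
# One path through each of two separate HBAs.
two_hbas = [Path("hba0", 0), Path("hba1", 0)]

print(hba_is_single_point_of_failure(dual_port))  # True
print(hba_is_single_point_of_failure(two_hbas))   # False
```

Counting paths alone would report both configurations as redundant, which is exactly the trap with a dual-port HBA.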

In a storage area network (SAN), you need to be sure that any switches or routers lying in the data path have enough path management intelligence to handle disruptions caused by an HBA failure.

In theory, MPIO will fail over in the event of a data path interruption and then fail back in a matter of seconds when the path is restored. In practice, it's not always that clean. In cluster environments, you need to be sure that the cluster can fail back to restore the data path as well as fail over in the event of a problem. Failback is a separate operation from failover, and the two aren't necessarily symmetrical. A system that doesn't fail back (or worse, doesn't fail over in the first place) may have a misconfigured HBA, switch or other component, or it may have a problem with the HBA itself. Contact your hardware vendor for more information. (Microsoft discusses what happens when an HBA is unplugged and plugged back in, simulating failover and failback on a cluster.)
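To make the asymmetry between failover and failback concrete, here is a toy model of two data paths. It is only a sketch: the `PathSelector` class, its `failback_enabled` flag and the path names are assumptions invented for illustration, and it does not represent how MPIO or any vendor's DSM is actually implemented.

```python
class PathSelector:
    """Toy model of a preferred and an alternate data path (illustration only)."""

    def __init__(self, preferred, alternate, failback_enabled=True):
        self.preferred = preferred
        self.alternate = alternate
        self.failback_enabled = failback_enabled
        self.healthy = {preferred: True, alternate: True}
        self.active = preferred

    def path_down(self, path):
        """Mark a path failed; fail over if it was carrying I/O."""
        self.healthy[path] = False
        if path == self.active:
            other = self.alternate if path == self.preferred else self.preferred
            # Failover: move I/O to the other path, if that path is usable.
            self.active = other if self.healthy[other] else None

    def path_up(self, path):
        """Mark a path restored; fail back only if that is enabled."""
        self.healthy[path] = True
        if self.active is None:
            # No path was usable, so take whichever one just came back.
            self.active = path
        elif path == self.preferred and self.failback_enabled:
            # Failback: a separate step from failover.
            self.active = self.preferred


selector = PathSelector("hba0", "hba1", failback_enabled=False)
selector.path_down("hba0")
print(selector.active)  # hba1 (failover worked)
selector.path_up("hba0")
print(selector.active)  # hba1 (path restored, but no failback)
```

In this model, a system can pass an unplug test and still stay on the alternate path after the cable goes back in, which is why both directions are worth testing.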

If you're still running Windows 2000, there's another potential problem. Multipath software may cause the disk signature to change if there is a failure. This can cause the system to fail because it can't find the disk.

In general, if you have problems related to multipathing, you're not going to be dealing with Microsoft. The HBA vendors, not Microsoft, write the MPIO drivers. Microsoft's position is that the hardware and related software vendors implement the DSM and the other details, so they are the place to go for help in troubleshooting.

About the author: Rick Cook specializes in writing about issues related to storage and storage management.

This was first published in November 2006
