What to do when HDLM creates an error

We have two SAN fabrics (for redundancy) and are using Compaq ProLiant servers with Compaq FCA-2101 adapters (Emulex LP950). Our storage is an HDS 9960. We want to use HDLM for HBA failover on the servers. Now we have the following problem.

After installing HDLM, everything seemed correct. Before the installation I saw each disk once, but in HDLM I see every disk twice. If one HBA fails, nothing happens and service keeps running properly. But if the other HBA fails, I lose all disks. I swapped the HBAs, but I have the same problem.

We are using W2KSP3 and HDLM 4.0/C with MSCS. Can you help me?

First of all, place a service call so one of the HDS services folks can help you out.

HDLM always tries to automatically place a failed path back online. Check your error log and also check the logs in the fabric switches. It may be that, for some reason, the failed path cannot be placed back online because of other problems in the path.

The last online path for a device is never placed offline even if an error occurs or even if the user manually tries to set it offline. This ensures continuous access to the LUNs. If an error occurs in the last online path for a device, HDLM checks the status of other paths that are offline. If a path can be placed online, HDLM places that path online and switches to it.
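The rule above can be illustrated with a small Python sketch. This is only a toy model and not HDLM's actual implementation: the `Path` and `Lun` classes, the `healthy` flag, and the method names are all hypothetical, and it assumes that an error on a path that is not the last online one simply takes that path offline.

```python
from dataclasses import dataclass


@dataclass
class Path:
    path_id: int
    online: bool = True
    healthy: bool = True  # hypothetical flag: does the physical link work?


class Lun:
    """Toy model of one LUN and its paths (not HDLM's real data structures)."""

    def __init__(self, paths):
        self.paths = list(paths)

    def online_paths(self):
        return [p for p in self.paths if p.online]

    def set_offline(self, path_id):
        """Manual offline request; refused for the last online path."""
        target = next(p for p in self.paths if p.path_id == path_id)
        if target.online and len(self.online_paths()) == 1:
            return False  # the last online path always stays online
        target.online = False
        return True

    def report_error(self, path_id):
        """Handle an error on a path; return the path now used, or None."""
        target = next(p for p in self.paths if p.path_id == path_id)
        target.healthy = False
        others_online = [p for p in self.online_paths() if p is not target]
        if others_online:
            # Assumption: with other online paths left, fail over to one.
            target.online = False
            return others_online[0].path_id
        # Error on the last online path: look for an offline path that works.
        for p in self.paths:
            if p is not target and not p.online and p.healthy:
                p.online = True
                target.online = False
                return p.path_id
        return None  # nothing to switch to; the last path is left online
```

In this model, a LUN with two paths survives the loss of either one, but if the first path is still broken when the second fails, `report_error` returns `None` and no working path remains. That matches the symptom described in the question.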

Make sure that each path is using a different fabric (switch). If the paths are tied together via a switch ISL port, then you should place each path in a separate zone.

Automatic path health checking and automatic path failback are turned off by default. To turn them on, open a command prompt and type these commands:

>dlnkmgr set -pchk on -intvl 10 (turns on path health checking and checks every 10 minutes; the value can be between 1 and 30 minutes)
>dlnkmgr set -afb on -intvl 15 (automatically places a failed path back online if possible, retrying every 15 minutes; the value can be between 1 and 1440 minutes)
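If you need to apply these settings on several servers, a small wrapper script can check the intervals against the ranges given above before anything is run. The following Python sketch is only an illustration: the function names and the `dry_run` switch are hypothetical, and it uses only the `dlnkmgr` options quoted in this answer. By default it just prints the commands.

```python
import subprocess

# Interval limits in minutes, as quoted in the answer above.
PCHK_RANGE = (1, 30)
AFB_RANGE = (1, 1440)


def _check(name, minutes, bounds):
    """Return the interval as a string, or raise if it is out of range."""
    low, high = bounds
    if not low <= minutes <= high:
        raise ValueError(f"{name} interval must be {low}-{high} min, got {minutes}")
    return str(minutes)


def health_check_cmd(minutes=10):
    """Build the command that turns on path health checking."""
    return ["dlnkmgr", "set", "-pchk", "on",
            "-intvl", _check("health check", minutes, PCHK_RANGE)]


def failback_cmd(minutes=15):
    """Build the command that turns on automatic path failback."""
    return ["dlnkmgr", "set", "-afb", "on",
            "-intvl", _check("failback", minutes, AFB_RANGE)]


def apply(commands, dry_run=True):
    """Print each command, or run it when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # raises if dlnkmgr reports failure


if __name__ == "__main__":
    apply([health_check_cmd(10), failback_cmd(15)])
```

Run it once with the default dry run to review the commands, then call `apply(..., dry_run=False)` on a server where `dlnkmgr` is on the PATH.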

If you have not turned these functions on and you have already lost your first path, then losing the second one leaves all paths unavailable. You can also manually place a path back online by using this command:

>dlnkmgr online -pathid "path" (where "path" is the path number, such as 1 or 2)

Hope this helps.


