Your drivers are kind of old... The latest driver version for the LP8000 is 5-4.82a4 for Windows 2000 and 4-4.82a4 for NT.
I have seen some weird interactions with different driver versions and versions of microcode on the switches. I would need to know the version of switch microcode you are using to determine whether this is an issue.
I see you used the switches and the storage array as a resource in determining the problem. That's always the best way to start: always troubleshoot a SAN from the switch out. One thing I have seen cause problems like this, which you may have overlooked, is the cabling. I assume you are using 50-micron cables and that they are no longer than 500 meters. I also assume you are not using patch panels or, if you are, that there is no more than 4 dB of signal loss on the cables.
Cable problems can cause all sorts of screwy things to happen. If a cable has a micro bend (it's pinched) or a macro bend (it's looped too tight), the light will have trouble traveling through the cable, and the signal will degrade. If you look at the error logs on the switch ports, you may see multiple FLOGI requests or port timeouts. If the driver error log reports no errors, that would indicate cable issues.
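To make the cabling checks concrete, here is a rough Python sketch that turns the assumptions above into a checklist for one link. The `Link` record, its field names, and the error counts are hypothetical; you would fill them in by hand from your cable documentation, a loss measurement, the switch port logs, and the HBA driver log. Only the 50-micron, 500-meter, and 4 dB figures come from the answer itself.

```python
from dataclasses import dataclass

MAX_LENGTH_M = 500   # length limit quoted above for 50-micron cable
MAX_LOSS_DB = 4.0    # maximum signal loss quoted above


@dataclass
class Link:
    """Hypothetical record for one host-to-switch link, filled in by hand."""
    name: str
    core_microns: int
    length_m: float
    loss_db: float            # measured end-to-end loss, patch panels included
    switch_port_errors: int   # repeated FLOGI requests plus port timeouts
    driver_errors: int        # entries in the HBA driver error log


def cable_findings(link):
    """Return the reasons (if any) to suspect the cabling on this link."""
    findings = []
    if link.core_microns != 50:
        findings.append("not 50-micron cable")
    if link.length_m > MAX_LENGTH_M:
        findings.append("longer than 500 m")
    if link.loss_db > MAX_LOSS_DB:
        findings.append("more than 4 dB loss")
    # Errors on the switch port with a clean driver log point at the cable.
    if link.switch_port_errors > 0 and link.driver_errors == 0:
        findings.append("switch port errors with clean driver log: suspect cable")
    return findings


if __name__ == "__main__":
    suspect = Link("server3-port7", 50, 650, 5.2, 3, 0)
    for reason in cable_findings(suspect):
        print(f"{suspect.name}: {reason}")
```

An empty list does not prove the cable is good. It only means none of the checks above fired.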
The new switches out there have some pretty cool fabric management software that can be used as a tool for troubleshooting intermittent issues like the ones you are seeing. Go to your switch vendor's Web site to find out about this stuff.
Check the error logs on the system too. Queue depth issues can be file system related. Make sure you have enough system memory on your servers and that the file system is behaving properly. One good way to determine this, on NT, is to run IOMETER (from the Intel Web site) against your storage array from multiple servers and see whether a particular server is experiencing the problem. This will take the application out of the loop.
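Once you have run the same test from each server, the comparison can be as simple as the Python sketch below. It assumes you have copied one average response time per server into a dictionary by hand. It does not parse IOMETER's result files, and the server names, the numbers, and the "twice the median" threshold are all made up for illustration.

```python
from statistics import median


def find_outliers(avg_response_ms, factor=2.0):
    """Return servers whose average response time exceeds factor x the median."""
    typical = median(avg_response_ms.values())
    return sorted(
        server
        for server, value in avg_response_ms.items()
        if value > factor * typical
    )


if __name__ == "__main__":
    # Hypothetical results from the same workload run on four servers.
    results = {"srv1": 8.0, "srv2": 9.0, "srv3": 31.0, "srv4": 8.5}
    print(find_outliers(results))
```

If one server stands out, look at that server's HBA, driver, memory, and cable first. If none does, the problem is more likely in the switch or the array.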
As a last resort, you can call in someone who has access to a Fibre Channel protocol analyzer, such as a consultant or your storage vendor. If you have a maintenance contract with the vendor, use it. The catch is that if the problem is intermittent, you may have trouble determining where in the path to hook the analyzer up.
Editor's note: Do you agree with this expert's response? If you have more to share, post it in one of our discussion forums.
This was first published in November 2002