Ask the Expert

Avoid maintenance problems with proper SAN design

Bits & Bytes: We all know that if things were done right in the first place, problems wouldn't creep up on us, right?

According to expert Chris Poelker, it's certainly true when designing a SAN environment. In this Ask the Expert answer, Chris discusses the problems one can encounter with a poorly designed SAN, unimplemented change control policies and the importance of flexibility.

A reader recently asked Chris the following questions:

In a SAN environment where you have a significant number of servers sharing a few switches to get to large storage devices, it is no small task to coordinate outages to implement switch firmware upgrades. Do switch vendors take this issue into account when determining how often upgrades come out and how long existing versions are supported?

Here's what Chris had to say in response:

Not as far as I have seen. Vendors release new firmware versions on a regular basis, either to fix bugs or to add new functionality. Your experience with outages could mean that your SAN environment was not properly designed from the start, or that you have been adding to the SAN with no change control policies in place.

What you are experiencing is the reason why proper SAN design MUST take into account scheduled and unscheduled maintenance procedures and be flexible enough to withstand multiple component failures. If you have a "significant number of servers sharing a few switches to get to large storage devices", then you need to plan for congestion. Make sure your fan-in ratio is in line with current standards for the bandwidth you are using and that the inter-switch links (ISLs) are properly balanced to take the load. Use trunking if possible.
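As a rough illustration of the congestion check, the sketch below computes a fan-in ratio (host ports per storage port) and an ISL oversubscription figure (aggregate host bandwidth over aggregate ISL bandwidth). The function names, the port counts and the 2 Gbps link speed are assumptions made up for the example. No acceptable limit is hard-coded, because the right figure depends on your vendor's guidance for the bandwidth in use.

```python
def fan_in_ratio(host_ports: int, storage_ports: int) -> float:
    """Host ports sharing each storage port (hypothetical helper)."""
    if host_ports < 0 or storage_ports <= 0:
        raise ValueError("need host_ports >= 0 and storage_ports > 0")
    return host_ports / storage_ports


def isl_oversubscription(host_ports: int, host_gbps: float,
                         isl_count: int, isl_gbps: float) -> float:
    """Aggregate host bandwidth divided by aggregate ISL bandwidth.

    Assumes every host port could drive its full link speed across the
    ISLs at once, which is a worst-case simplification.
    """
    if isl_count <= 0 or isl_gbps <= 0:
        raise ValueError("need at least one ISL with positive bandwidth")
    return (host_ports * host_gbps) / (isl_count * isl_gbps)


if __name__ == "__main__":
    # Illustrative numbers only: 48 host ports, 4 storage ports, 4 ISLs.
    print(f"fan-in ratio: {fan_in_ratio(48, 4):.0f}:1")
    print(f"ISL oversubscription: {isl_oversubscription(48, 2.0, 4, 2.0):.0f}:1")
```

Compare both figures against the limits your switch and storage vendors publish. If either is too high, add storage ports or ISLs, or trunk the existing ISLs.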

For those servers where planning for downtime is painful, make sure you have at least TWO host bus adapters (HBAs) in each server with path failover software, connect each path to storage across TWO separate fabrics, and assign volumes to the servers from TWO separate storage ports. Doing so gives you a resilient design in which the path failover driver automatically fails over as you take down each fabric, one at a time, to perform maintenance. Also, make sure your storage vendor supports "online microcode loads", so you have zero planned downtime.
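The "TWO of everything" rule can be checked before a maintenance window. The sketch below is a minimal model, not a vendor tool: the `Path` record, the function names and the sample HBA, fabric and port labels are all hypothetical. It reports which redundancy requirement a server is missing and whether the server keeps at least one path when a given fabric is taken down.

```python
from typing import List, NamedTuple


class Path(NamedTuple):
    """One server-to-storage path (hypothetical record for illustration)."""
    hba: str
    fabric: str
    storage_port: str


def redundancy_gaps(paths: List[Path]) -> List[str]:
    """List the 'at least TWO' requirements this server does not meet."""
    gaps = []
    if len({p.hba for p in paths}) < 2:
        gaps.append("fewer than two HBAs")
    if len({p.fabric for p in paths}) < 2:
        gaps.append("fewer than two fabrics")
    if len({p.storage_port for p in paths}) < 2:
        gaps.append("fewer than two storage ports")
    return gaps


def survives_fabric_outage(paths: List[Path], fabric: str) -> bool:
    """True if some path avoids the fabric being taken down."""
    return any(p.fabric != fabric for p in paths)


if __name__ == "__main__":
    # Made-up example: two HBAs, but both cabled into fabric A.
    server_paths = [Path("hba0", "A", "sp0"), Path("hba1", "A", "sp1")]
    print(redundancy_gaps(server_paths))
    print(survives_fabric_outage(server_paths, "A"))
```

A server that reports no gaps should ride through maintenance on either fabric, provided the path failover driver is installed and working. The check covers cabling and volume assignment only, not driver behavior.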

Downtime adds up and increases the operating costs of your storage solutions. Always take a close look at this when making future purchase decisions. Customers need to push vendors for zero downtime, and the vendors get the message when your storage dollars are going to their competitor!

Chris

Editor's note: Do you agree with this expert's response? If you have more to share, post it in one of our discussion forums.


This was first published in May 2003
