
Design for high reliability

Tips for making sure your SAN will stay available.


Rick Cook

Because a Storage Area Network (SAN) usually contains a large fraction of an enterprise's storage capacity, it is important that SANs be designed and built for high availability. According to Brocade, a maker of SAN equipment, there are a number of factors in designing and building a highly reliable SAN.

One of the most important, Brocade says, is redundancy. For reliability, a SAN should have no single point of failure. That means having alternate devices, alternate data paths and a configuration that can continue to support the enterprise during a failure. Redundancy and physical separation are particularly important in the event of a disaster.
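As a rough illustration of the "no single point of failure" rule, the sketch below models a SAN as a set of cabled devices and reports any device whose loss alone would cut a host off from its storage. The device names and topologies are hypothetical and do not come from Brocade's paper. The host and the array are excluded from the check, so protecting those still depends on clustering and on redundancy inside the array.

```python
from collections import deque


def make_links(pairs):
    """Build an undirected adjacency map from (device, device) cable pairs."""
    links = {}
    for a, b in pairs:
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    return links


def reachable(links, start, goal, removed=None):
    """Breadth-first search from start to goal, skipping the removed device."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for neighbor in links.get(node, ()):
            if neighbor != removed and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


def single_points_of_failure(links, host, storage):
    """Return devices whose loss alone cuts the host off from the storage."""
    devices = set(links) - {host, storage}
    return sorted(
        d for d in devices if not reachable(links, host, storage, removed=d)
    )


# Hypothetical topology: two host adapters, but both cabled to one switch.
single_fabric = make_links([
    ("host", "hba_a"), ("host", "hba_b"),
    ("hba_a", "switch_1"), ("hba_b", "switch_1"),
    ("switch_1", "array"),
])

# Hypothetical topology: two fully independent paths to the array.
dual_fabric = make_links([
    ("host", "hba_a"), ("host", "hba_b"),
    ("hba_a", "switch_1"), ("hba_b", "switch_2"),
    ("switch_1", "ctrl_a"), ("switch_2", "ctrl_b"),
    ("ctrl_a", "array"), ("ctrl_b", "array"),
])

print(single_points_of_failure(single_fabric, "host", "array"))  # ['switch_1']
print(single_points_of_failure(dual_fabric, "host", "array"))    # []
```

In the first layout the second adapter adds little, because both adapters depend on the same switch. The second layout survives the loss of any one adapter, switch or controller.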

Other points Brocade mentions include simple monitoring, diagnostic and repair capabilities to ensure fast diagnosis of and recovery from problems; a minimal need for human intervention in the event of a failover; and a reliable backup and recovery plan that covers a wide range of contingencies.

SAN capabilities that build fault tolerance include intelligent routing and rerouting, no single points of failure, dynamic failover protection, non-disruptive maintenance of servers and storage, hardware zoning and predictive fabric management.
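To make the idea of dynamic failover concrete, here is a minimal sketch of a device that sends each request over the first healthy path in a preference list and switches paths without operator action when one fails. The class name, method names and path labels are invented for illustration. Real multipathing is handled by driver or fabric software and is considerably more involved.

```python
class MultipathDevice:
    """Toy model of a storage device reachable over several paths."""

    def __init__(self, paths):
        self.paths = list(paths)  # listed in order of preference
        self.healthy = {path: True for path in self.paths}

    def mark_failed(self, path):
        """Record that a path (for example, a fabric) is down."""
        self.healthy[path] = False

    def mark_restored(self, path):
        """Record that a repaired path is usable again."""
        self.healthy[path] = True

    def active_path(self):
        """Return the most preferred path that is currently healthy."""
        for path in self.paths:
            if self.healthy[path]:
                return path
        raise RuntimeError("all paths to storage have failed")

    def send(self, request):
        """Route a request over whichever path is currently active."""
        return f"{request} via {self.active_path()}"


device = MultipathDevice(["fabric_a", "fabric_b"])
print(device.send("read block 7"))   # read block 7 via fabric_a
device.mark_failed("fabric_a")
print(device.send("read block 7"))   # read block 7 via fabric_b
device.mark_restored("fabric_a")
print(device.send("read block 7"))   # read block 7 via fabric_a
```

Because the caller only ever uses `send`, a path can be taken down for maintenance and restored later without the application noticing, as long as one other path stays healthy.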

Brocade discusses its vision of high-reliability SAN design in a white paper titled "Improving System Availability with Storage Area Networks," which can be found on its Web site. Further, most SAN vendors, such as EMC and Compaq, have similar white papers available on their Web sites.

Rick Cook has been writing about mass storage since the days when the term meant an 80K floppy disk. The computers he learned on used ferrite cores and magnetic drums. For the last twenty years he has been a freelance writer specializing in storage and other computer issues.

Additional Resources:
1. How should I choose RAID controllers?
This is a comprehensive article on how to choose RAID controllers and subsystems. After covering the benefits of using RAID, the author describes what to look for in a RAID controller. He then goes into the importance of redundancy and the pros and cons of buying vs. building complete RAID subsystems. The piece also includes a RAID glossary.

2. Can I use one Fibre adapter to host both disk and tape?
A searchStorage reader poses this question to SAN expert Christopher Poelker: "I get asked the question these days of: 'Can I place two Fibre adapters in a system, have one for failover and/or redundancy, while having both disk and DLT tape (for backup) connected to both adapters?'"

3. RAID or JBOD: Which is right for my organization?
JBOD (Just a Bunch of Disks) is often promoted as a cheaper alternative to a RAID array for applications such as SANs that use a lot of storage. Unlike RAID, which is organized in various manners, many of which offer redundancy and error recovery, JBOD is simply treated as one or more large disk drives. While JBOD is the cheaper alternative, this tip suggests there are other factors to consider.

4. How can I troubleshoot a production fabric environment?
Many companies implement a SAN environment without knowing what is required for troubleshooting and problem determination. In some cases a problem on a fabric could indeed cause a central failure of a SAN no matter how much redundancy is built in. This user-submitted tip offers two methods for setting up a separate fabric, which is the starting point for troubleshooting.
