Clustering in a SAN environment for high availability

How to increase availability by implementing a SAN in a clustered environment.

In the modern world of 24x7 business, many companies are in search of highly available computer systems and data access to support customer and business needs. Putting a figure on downtime costs, International Data Corporation (IDC) estimates that a typical large enterprise can lose about $100,000 an hour in the wake of a data outage. Alarmingly, even 99% data availability still means a business will experience downtime. Clustered servers combined with well-designed communication and storage networks running the appropriate software applications may be the best solution for companies that require high-availability systems.
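To make the 99% figure concrete, the arithmetic can be sketched as follows. This is a minimal illustration, not an IDC calculation: it assumes a 365-day year and applies the quoted $100,000 per hour uniformly to every hour of downtime, and the function names are hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, assuming a non-leap year


def downtime_hours_per_year(availability_pct):
    """Hours per year a system is unavailable at the given availability."""
    return HOURS_PER_YEAR * (100 - availability_pct) / 100


def downtime_cost_per_year(availability_pct, cost_per_hour=100_000):
    """Yearly outage cost, assuming every down hour costs cost_per_hour."""
    return downtime_hours_per_year(availability_pct) * cost_per_hour


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.99):
        hours = downtime_hours_per_year(pct)
        cost = downtime_cost_per_year(pct)
        print(f"{pct}% available: {hours:.2f} hours down, about ${cost:,.0f} a year")
```

Under these assumptions, 99% availability works out to 87.6 hours of downtime a year, or roughly $8.76 million at the quoted hourly rate.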

A cluster is two or more servers (sometimes called nodes) that are interconnected to appear as one. They share storage resources for maximum efficiency and cost savings. Clusters are typically made up of two to eight servers joined together. Clustering might mean having multiple computer systems working on one task to improve performance or to address a problem too big for one system. Or, it may mean having systems readily available that can stand in for another system without losing continuity or consistency in the operating system. Yet again, it may mean that a particular application is running on multiple computer systems and the same information can be accessed at any one instance. Server clustering is gaining in popularity as more IS departments try to obtain high-end system features at commodity hardware prices. The appearance of clustering as part of the Microsoft Windows architecture has also raised awareness of this strategy and moved many to implement clusters.

Storage area networks (SANs) are an ideal way to meet the basic requirements of high availability clusters. Separating data from the server by placing it in a SAN works well with clustering. Clustered servers must have access to the same data if they are to work together to keep an application up and running. The process of attaching, expanding, and reallocating storage among a company's many server resources is greatly simplified with a SAN. SANs also increase overall availability by allowing any single node on the SAN to be connected or disconnected without disrupting service to other nodes. If an actively accessed storage device is removed, the devices accessing that storage system may have problems retrieving data. This too is a problem that a SAN can help alleviate by providing redundant sets of data.

Some SAN switches are even being designed specifically for clustered server environments. For instance, Gadzoox Networks recently introduced the Slingshot 4210, a 2Gb SAN switch with 10 ports optimized for clustered servers. The Slingshot 4210 is an open-fabric switch that is typically installed in failover pairs to enable a high availability SAN system design. If one switch, cable or GBIC has a fault, the host bus adapter in the server will see the path disappear and route traffic over the surviving switch. The application does not notice the difference and continues to run as designed.

The first step in designing a clustered environment SAN is to create high availability at the point where server and SAN connect. Optimally, there should be two Fibre Channel host bus adapters (HBAs) installed in each server. There are several choices of HBA, the first being between a fixed-port configuration and one that accepts a Gigabit Interface Converter (GBIC) or Small Form-factor Pluggable (SFP) module. Even though a GBIC-based adapter can carry a higher price tag, it allows for changes of media type if needed, and the GBIC is one more component that can be hot-plugged into the system in the event of a port failure. It is also important to look for the following features in an HBA: hot standby, load sharing and load balancing capabilities.
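How load balancing and failover across two HBA paths interact can be sketched with a toy model. This is an illustration only, not the API of any real HBA driver or multipathing product: the class, its methods and the path names are all hypothetical, and simple round-robin is assumed as the balancing policy.

```python
class MultipathSelector:
    """Toy model of a server with several HBA paths (not a real driver API)."""

    def __init__(self, paths):
        self._order = list(paths)
        self._healthy = {path: True for path in self._order}
        self._next = 0  # index of the next path to try

    def mark_failed(self, path):
        self._healthy[path] = False

    def mark_restored(self, path):
        self._healthy[path] = True

    def pick_path(self):
        """Round-robin over the paths, skipping any marked as failed."""
        for _ in range(len(self._order)):
            path = self._order[self._next]
            self._next = (self._next + 1) % len(self._order)
            if self._healthy[path]:
                return path
        raise RuntimeError("no healthy path to storage")


if __name__ == "__main__":
    selector = MultipathSelector(["hba0->switch1", "hba1->switch2"])
    print([selector.pick_path() for _ in range(4)])  # alternates between paths
    selector.mark_failed("hba0->switch1")
    print([selector.pick_path() for _ in range(3)])  # only the surviving path
```

While both paths are healthy, I/O alternates between them; once one is marked failed, every request goes over the survivor, which is the behavior the application should never have to notice.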

The second step is to select the switch. Each server HBA should be connected to a switch. For a beginning SAN, a low-cost loop switch or low-port-count fabric switch is an ideal building block. In SAN design, the simple implementation for high availability is to have two switches. Each switch should have the same set of servers and storage connected to it, thus providing for two independent access paths from server to data. Using at least one port on each switch to connect the switches will simplify management of the SAN, while also providing another path of data transfer should the SAN experience multiple path failures. For example, if a connection from the server to Switch1 were to fail, followed by a failure of the connection between Switch2 and the data, you would still be able to go from the server to Switch2 to Switch1 to the data.
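The double-failure example above can be checked with a small reachability sketch. It models the two-switch design as a graph and searches for a surviving route; the node names and helper functions are illustrative assumptions, not part of any SAN management tool.

```python
from collections import deque


def build_links(failed=()):
    """Dual-switch topology from the text, minus any failed links."""
    links = [
        ("server", "switch1"), ("server", "switch2"),
        ("switch1", "storage"), ("switch2", "storage"),
        ("switch1", "switch2"),  # inter-switch link
    ]
    broken = {frozenset(link) for link in failed}
    return {frozenset(link) for link in links} - broken


def find_path(links, start="server", goal="storage"):
    """Breadth-first search; returns a shortest surviving path or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for link in links:
            if node in link:
                (other,) = link - {node}  # the far end of this link
                if other not in seen:
                    seen.add(other)
                    queue.append(path + [other])
    return None


if __name__ == "__main__":
    double_failure = [("server", "switch1"), ("switch2", "storage")]
    print(find_path(build_links(failed=double_failure)))
```

With those two links failed, the only route left is server to switch2 to switch1 to storage. Failing the inter-switch link as well leaves no route at all, which is why the extra connection between the switches is worth a port on each.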

The last step is to connect the data. As with the server, you need to make sure that each data storage unit has multiple paths of access, and that it supports the same features as the server HBA: hot standby, load sharing and load balancing. It is also important that your data is laid out at an appropriate RAID level, both for speed of data access and for availability. SANs provide the additional benefit of LAN-free backup. This means that when data transfers to tape, no additional traffic requirements are placed on the LAN. With a SAN, it is now possible for one of the Fibre Channel switches to control routing data to multiple target storage arrays instead of relying on the server or storage device to move data to two or more locations.

Clustering can provide businesses and departments with more function for less money. SANs as part of a cluster environment enable storage to scale and fail over along with the cluster to achieve maximum uptime. A well-designed storage network that includes the clustering of servers will not only help improve availability numbers, it will also help reduce the total cost of ownership.

About the author: Spencer Sells is currently manager of product marketing, responsible for worldwide efforts in development, sales, management and marketing for Gadzoox Networks' product lines. He previously worked for Amdahl Corporation as the product manager for decision support solution offerings and storage systems, and contributed to overseas operations during a two-year assignment in its corporate offices in Japan. He has a bachelor's degree in international relations from Stanford University and a master's in international affairs from Johns Hopkins University.