THE 24 X 7 ORGANIZATION
Server clustering boosts reliability, eases failover
System downtime is no longer just an internal matter, thanks to the advent of business on the Web. Clustering is one option for keeping servers running smoothly.
By Edward Hurley, TechTarget
In this age of e-commerce and Web-enabled applications, companies are turning to clusters of interconnected servers for the scalability and reliability needed to compete. Server clusters are a broad category of systems, ranging from computers that share storage to groups of servers that can "fail over," or redistribute the workload of one server to another, with the help of special software.
One of the major advantages of clustering is the ability to increase computing power by adding another server, or "node," to the cluster. This ability came in handy for the online pharmacy Drugstore.com. Compaq had designed a system that would accommodate up to 3 million hits, the projected maximum for the site. However, 10 million people hit the site shortly after launch. To keep the site online, Compaq rotated in more servers, said Mark Silverberg, technical marketing manager for Compaq's high-performance server organization.
Reliability is another advantage of clustering, because some clusters include management software that reallocates the workload of a server that fails. This can help minimize downtime, which can be especially costly for companies that guarantee system availability to their customers. Such is the case with Acxiom Corp., which helps companies analyze their customer data. "We are truly required to be a 24x7 operation as we guarantee 99.99% availability to our customers," said Tim Donar, the senior system architect of the Little Rock, Ark.-based company.
Moreover, clustering software has progressed to a point where each cluster can be viewed and supervised as a single system, Donar said. Acxiom used to limit its clusters to eight nodes because each had to be managed separately. For example, adding user account information used to require entering the information on each server in the cluster. Now it's just a matter of inputting information once, Donar said.
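To put the 99.99% availability figure Donar cites in concrete terms, the short sketch below converts an availability percentage into a yearly downtime budget. The function name and the comparison targets of 99% and 99.9% are illustrative assumptions, not figures from Acxiom.

```python
# Illustrative arithmetic only: converts an availability guarantee into
# the amount of downtime it permits over one non-leap year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for a given availability."""
    unavailable_fraction = 1 - availability_percent / 100
    return MINUTES_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    # 99.0 and 99.9 are hypothetical comparison points; 99.99 is the
    # figure quoted in the article.
    for target in (99.0, 99.9, 99.99):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% availability allows {minutes:.1f} minutes of downtime per year")
```

By this arithmetic, a 99.99% guarantee leaves less than an hour of total downtime per year, which is why automated failover matters to a company making that promise.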
Perhaps the simplest example of a server cluster is two servers that are connected so that one mirrors the other and would step in during a failure. This complete redundancy is expensive but is a consideration for businesses that can't afford to have a system failure, said Brian Richardson, program director of open computing and server strategies for Stamford, Conn.-based Meta Group Inc.
"You need to do an assessment of downtime. A cluster is a lot like insurance. You don't want to spend more on it than the property you are protecting is worth," said Richardson.
Yet system downtime is no longer just an internal matter. "What was a simple outage has become headline news," said Dan Kusnetzky, vice president of System Software at International Data Corp.
For failover to occur, each node in the cluster must be in constant contact with the others, usually by way of an electronic pulse, or heartbeat. When a node stops emitting the heartbeat, the other nodes conclude that it has failed, and failover kicks in. The work is then divvied up among the other nodes or taken on by just one, depending on predetermined instructions.
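The heartbeat idea can be sketched in a few lines of code. The example below is a minimal illustration, not the mechanism of any particular clustering product: the class name, the timeout value, and the node names are all assumptions made for the sketch. It simply records when each node was last heard from and reports any node that has been silent for longer than the timeout.

```python
import time


class HeartbeatMonitor:
    """Hypothetical monitor that flags nodes whose heartbeat has gone quiet."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock      # injectable so the logic can be tested without waiting
        self.last_seen = {}     # node name -> time of most recent heartbeat

    def record_heartbeat(self, node):
        """Note that a heartbeat just arrived from the given node."""
        self.last_seen[node] = self.clock()

    def failed_nodes(self):
        """Return the nodes that have been silent for longer than the timeout."""
        now = self.clock()
        return sorted(
            node
            for node, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_seconds=0.2)  # assumed timeout
    monitor.record_heartbeat("node-a")
    monitor.record_heartbeat("node-b")
    time.sleep(0.3)
    monitor.record_heartbeat("node-a")  # node-a is still alive; node-b stays silent
    print("Presumed failed:", monitor.failed_nodes())
```

A real cluster would exchange heartbeats over a network link and would need to guard against false alarms, such as a broken link between two healthy nodes, which this sketch does not attempt.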
Clusters can be configured to fail over in a variety of ways. For example, a node can fail over to another in a different location in the event of a disaster. Some configurations have an extra node in the cluster that is usually idle. When a failure occurs, the idle node takes over and the cluster's capacity isn't compromised. Such functionality also comes into play during routine maintenance. For example, one node can be taken offline for upgrades or maintenance and its work shifted to another server.
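The two basic policies described here, handing a failed node's work to an idle standby or spreading it across the surviving nodes, can be illustrated with a short sketch. Everything in it is hypothetical: the function name, the node names, and the workload labels are invented for the example, and real clustering software weighs factors such as load and location that are ignored here.

```python
def fail_over(assignments, failed, standby=None):
    """Return new workload assignments after the `failed` node drops out.

    assignments: dict mapping node name -> list of workload names.
    standby: optional idle node that takes over everything; if omitted,
             the orphaned work is spread round-robin over the survivors.
    The input dictionary is left unmodified.
    """
    orphaned = list(assignments.get(failed, []))
    result = {node: list(work) for node, work in assignments.items() if node != failed}

    if standby is not None:
        # Idle-spare policy: one node absorbs all the orphaned work.
        result.setdefault(standby, []).extend(orphaned)
        return result

    survivors = sorted(result)
    if not survivors:
        raise RuntimeError("no surviving nodes to take over the work")

    # Redistribution policy: deal the orphaned workloads out in turn.
    for i, workload in enumerate(orphaned):
        result[survivors[i % len(survivors)]].append(workload)
    return result


if __name__ == "__main__":
    cluster = {"n1": ["web", "db"], "n2": ["mail"], "n3": ["cache"]}
    print("Spread over survivors:", fail_over(cluster, "n1"))
    print("Idle spare takes over:", fail_over(cluster, "n1", standby="spare"))
```

The same function covers planned maintenance: taking a healthy node offline for an upgrade is, from the scheduler's point of view, the same operation as losing it.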
Virtually every server operating system can support clustering to some extent. Generally, the choice of Unix or Windows, Linux or mainframe operating systems will be guided by the applications architecture, said Richardson.
Windows NT servers can fail over if one goes down, but clusters are limited to only four nodes. By contrast, a cluster of IBM mainframes can completely share system resources and appear, for all intents and purposes, as one system. Such clusters, however, can cost millions of dollars.
Brisbane, Calif.-based TurboLinux offers clustering software that allows servers of differing operating systems to be in a cluster. TurboLinux Cluster Server takes advantage of the Linux operating system, an open source Unix cousin. Thanks to its shared ancestry with Unix, Linux is very stable. It is also cost effective because traditional licensing fees don't need to be paid to a software vendor, said Ly-Huong Pham, the company's chief operating officer.
SearchEnterpriseServers.com has information on clustering for various types of servers.
SearchSystemsManagement.com's Best Web Links collection includes a section on Configuration Management.