

How to establish a strong failover environment


Evan Marcus and Hal Stern
12.29.2003




When a computer system fails, it can take hours, or in some cases days, to diagnose the failure. If the failure is an intermittent one, it can take even longer; some intermittent problems are never reliably diagnosed.

If it turns out that the source of the problem is hardware, a replacement for the failed part must be obtained, and then someone who is capable must be called upon to replace it. If the problem is in software, a patch to the application or to the operating system must be obtained, if it even exists (it may have to be written first). Assuming the fix works, the host must be rebooted, and recovery must be initiated from any damage that the failure may have caused.

Sometimes you'll find yourself in the finger-pointing circle game, where the hardware vendor blames the OS vendor, who blames the storage vendor, who blames the application vendor, who blames the hardware vendor again. All the while, of course, your system is down. If the failed server is a critical one, this sort of hours- or days-long outage due to vendor bickering is simply unacceptable.

What can you do? You could take your applications off the Unix or Windows server you've installed them on and put them on a multimillion-dollar fault-tolerant (FT) server instead. FT servers are designed with redundant hardware so that if one component fails, another can instantly step in and take over for it. (FT servers are often designed with triple-redundant hardware, and there is at least one quad-redundant system on the market!)

Unfortunately, a fault-tolerant (FT) server may still not offer adequate protection. Although the FT vendors may make enhancements to their drivers and operating system, FT systems are no less vulnerable to software issues than more conventional systems. What's more, by their nature, FT systems are closed systems that do not offer the flexibility or connectivity of conventional systems, because those benefits can introduce risk to the system. It is difficult to migrate existing applications to FT systems, because they are not always compatible with conventional systems. FT systems are popular in certain high-end applications, such as gaming (casinos and lotteries), and air traffic control, where the benefits that they provide offset their cost.

A more practical and less expensive solution is to take two or more conventional servers and connect them together with some controlling software, so that if one server fails, the other server can take over automatically. The takeover occurs with some interruption in service, but that interruption is usually limited to just a few minutes.

The migration of services from one server to another is called failover.
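
As a rough illustration of what the controlling software does, here is a minimal sketch of a two-server pair. Everything in it is an assumption made for illustration: the `Server` and `FailoverController` names, the heartbeat mechanism, and the 10-second timeout are hypothetical, not details from the book or from any particular clustering product. A real product must also deal with shared-disk ownership and with servers that only appear to be dead, which this sketch ignores.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """One conventional server in the pair (hypothetical model)."""
    name: str
    last_heartbeat: float = 0.0                 # time of the latest heartbeat, in seconds
    services: set = field(default_factory=set)  # services currently running here


class FailoverController:
    """Hypothetical controlling software that watches a primary and a standby."""

    def __init__(self, primary, standby, timeout=10.0):
        self.primary = primary
        self.standby = standby
        self.timeout = timeout  # assumed silence threshold, not from the article

    def heartbeat(self, server, now):
        """Record that a server reported in at time `now`."""
        server.last_heartbeat = now

    def is_alive(self, server, now):
        """A server counts as alive if it reported within the timeout."""
        return now - server.last_heartbeat <= self.timeout

    def check(self, now):
        """Fail over if the primary is silent and the standby is alive.

        Returns True if a failover happened, False otherwise.
        """
        if self.is_alive(self.primary, now):
            return False
        if not self.is_alive(self.standby, now):
            return False  # no healthy server to take over
        # Migrate every service to the standby, then swap the two roles.
        self.standby.services |= self.primary.services
        self.primary.services.clear()
        self.primary, self.standby = self.standby, self.primary
        return True
```

Calling `check` periodically with the current time models the automatic takeover: once the primary has been silent for longer than the timeout, its services move to the surviving server, which then becomes the primary. Passing the time in as an argument, rather than reading a clock inside the class, keeps the sketch deterministic and easy to exercise.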

To ensure data consistency and rapid recovery in a failover situation, the servers should be connected to the same shared disks. This series of tips will assume that the servers are located within the same site, and generally in the same room. (Migrating critical applications to a remote site is a disaster recovery issue. While it seems similar to the local case, it actually introduces many new variables and complexities. It will not be covered in this series, but is discussed in Chapter 18, "Data Replication," of the book "Blueprints for High Availability, Second Edition.")


Content in this tip has been excerpted by permission from the book "Blueprints for High Availability, Second Edition," authored by Evan Marcus and Hal Stern, Wiley Publishing, Inc. All rights reserved.

About the authors: Evan Marcus is a frequent SearchStorage.com contributor and an expert at answering readers' questions on availability, backup and disaster recovery. He is a principal engineer for Veritas Software and the industry's data availability maven, with over 12 years of experience in this area, and a frequent speaker at industry technical conferences.

Hal Stern is the vice president and chief technology officer for the Services business unit of Sun Microsystems. He has worked on reliability and availability issues for some of the largest online trading and sports information sites, as well as several network service providers.








