Disaster recovery: Test, test and test some more


This article can also be found in the Premium Editorial Download "Storage magazine: Upgrade path bumpy for major backup app."


Most DR plans are constructed around recovering applications. The list typically starts with critical business applications such as enterprise resource planning (ERP), CRM or manufacturing execution systems, followed by lower-priority applications such as the company's public Web site.

The DR test of redundantly run apps consists of three steps: initiating the failover, having business users verify transactions and application consistency, and switching back to the primary application instance. The easiest failover scenarios to test are symmetrically load-balanced applications such as Web sites or grid clusters; as long as the failover works, the only impact of disabling nodes is an increased load on the remaining nodes. A performance impact analysis should therefore be part of the test.
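
As a rough aid for that performance impact analysis, the extra load on the surviving nodes can be estimated before any nodes are disabled. The sketch below is only an illustration: it assumes the load balancer spreads work evenly, and the function names and the 85% ceiling are hypothetical choices, not values from any product.

```python
def utilization_after_failover(total_nodes: int, disabled_nodes: int,
                               utilization: float) -> float:
    """Per-node utilization after disabling nodes, assuming an even spread of load."""
    remaining = total_nodes - disabled_nodes
    if remaining <= 0:
        raise ValueError("at least one node must remain active")
    # Total work is unchanged; it is simply shared by fewer nodes.
    return utilization * total_nodes / remaining


def has_headroom(total_nodes: int, disabled_nodes: int, utilization: float,
                 ceiling: float = 0.85) -> bool:
    """True if the surviving nodes stay at or below the chosen ceiling (assumed 85%)."""
    return utilization_after_failover(total_nodes, disabled_nodes, utilization) <= ceiling
```

For example, four nodes each running at 60% would run at roughly 80% after one node is disabled, and would be overloaded if two were disabled. Measured figures from the actual test should replace these estimates.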

Transactional applications like databases or Exchange can't be load balanced as easily and require a cluster-type product such as Microsoft Cluster Service, CA XOsoft's WANSync or products from Neverfail Group. These products continuously replicate transactions and changes to the failover system, monitor availability of the primary application and perform an automatic failover if the primary instance fails. DR testing of clustered applications is disruptive, as the primary instance can't be used while the test is performed. This was the reason Chaffe McCall's Zeller chose the CA XOsoft WANSync product family.
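
The monitor-and-failover behavior described above can be pictured with a generic heartbeat rule. This is a simplified sketch and not how Microsoft Cluster Service, WANSync or Neverfail actually decide to fail over; the function name and the threshold of three consecutive missed heartbeats are assumptions made for illustration.

```python
def should_fail_over(heartbeats, max_missed=3):
    """Return True once `max_missed` consecutive heartbeats have failed.

    `heartbeats` is an iterable of booleans, oldest first
    (True means the primary instance answered the availability check).
    """
    missed = 0
    for ok in heartbeats:
        # A successful check resets the counter; a failed one increments it.
        missed = 0 if ok else missed + 1
        if missed >= max_missed:
            return True
    return False
```

Requiring several consecutive misses keeps a single dropped check from triggering a failover. A DR test can exercise this kind of rule by deliberately stopping the primary's responses and confirming that the standby takes over.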

"[CA] XOsoft's Assured Recovery enables us to conduct comprehensive and regular DR tests to standby servers without any disruption to our production servers or any interruption to the disaster recovery protection," explains Zeller.

Recovering applications without a failover solution depends on restoring the app from backups. To avoid Zeller's experience during Hurricane Katrina, your DR test needs to ensure that the correct hardware, operating system and application software are available and that the recovery can be successfully performed on the designated recovery systems. The test needs to ensure that all required components, especially the application software, have a valid maintenance contract and valid software protection codes (SPCs) if required by an application. SPCs bind an application to specific hardware.
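
The contract and SPC checks lend themselves to a simple inventory audit that can be run before each DR test. The following is a minimal sketch; the `RecoveryComponent` record and its fields are hypothetical, and a real inventory would come from whatever asset or contract database the organization keeps.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class RecoveryComponent:
    name: str
    maintenance_expires: Optional[date]  # None means no contract on file
    needs_spc: bool = False              # application is bound to hardware by an SPC
    spc_on_file: bool = False            # a valid SPC exists for the recovery system


def recovery_blockers(components: List[RecoveryComponent], as_of: date) -> List[str]:
    """List the contract and SPC problems that would block a recovery on `as_of`."""
    problems = []
    for c in components:
        if c.maintenance_expires is None or c.maintenance_expires < as_of:
            problems.append(f"{c.name}: maintenance contract missing or expired")
        if c.needs_spc and not c.spc_on_file:
            problems.append(f"{c.name}: no valid SPC for the recovery hardware")
    return problems
```

An empty result means no blockers were found in the inventory. It does not replace actually performing the restore on the designated recovery systems.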

"During a disaster, you never want to hear that 'it' is out of contract," says Cookson Electronics' Bremerman. "Some of our applications were out of maintenance and we had to pay a premium to get them back on maintenance to get a valid SPC code."

Furthermore, vendor contact information needs to be up to date and you must ensure that the hours of support are in line with RTOs. "Getting hold of our vendors during the night and on the weekend was a huge challenge," laments Bremerman. "It took about 30 hours to get a good SPC code from JD Edwards."
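
One way to compare a vendor's support hours with an RTO is to find the longest stretch of the week in which the vendor can't be reached and add the expected restore time. The sketch below assumes support windows are given as hours since Monday 00:00; the function names and the sample figures are illustrative assumptions, not data from the article.

```python
def longest_support_gap(windows, week_hours=168):
    """Longest stretch in hours with no vendor support in a repeating week.

    `windows` is a list of non-overlapping (start, end) hours since Monday 00:00.
    """
    if not windows:
        return week_hours
    ordered = sorted(windows)
    # Gaps between consecutive windows within the week.
    gaps = [nxt[0] - cur[1] for cur, nxt in zip(ordered, ordered[1:])]
    # Gap that wraps from the last window of the week to the first of the next.
    gaps.append(ordered[0][0] + week_hours - ordered[-1][1])
    return max(gaps)


def support_meets_rto(windows, restore_hours, rto_hours):
    """True if the worst-case wait for the vendor plus the restore fits in the RTO."""
    return longest_support_gap(windows) + restore_hours <= rto_hours
```

With support offered Monday to Friday from 08:00 to 18:00, the longest gap runs from Friday evening to Monday morning, which is 62 hours. That alone exceeds a 24-hour RTO before any restore work begins.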

This was first published in September 2006
