This is the third of four parts of a mini-tutorial on the disaster recovery planning process. It covers the final phase, which you can refer to as Plan Implementation.
At this point, the lion's share of the work involved in defining recovery capability requirements and identifying the strategies that will comprise the recovery capability has been completed. Contracts may have been signed with a hot site vendor, or a homegrown backup data center and user work area have been outfitted with the computing, networking and storage infrastructure needed to carry you through a facility-wide shutdown. More importantly, data copies are now being completed and verified as a matter of routine.
However, for all your hard work, the tasks that remain to be done are among the most important in the planning project. For one thing, they ensure that your plan can be activated and used when necessary. For another, they create a constant process for plan improvement and enable organizations to ensure that their recovery capability is synchronized with their requirements.
The implementation phase involves the drafting of a plan document, or at least a decision-making flowchart that will walk a business or technical manager through the steps that must be taken in immediate response to an emergency. Get it out of your head that there exists a useful boilerplate plan after which you can model your own. Every good plan is unique. The best are succinct and provide less detailed descriptions of tasks than you might expect. That's because the paper plan is almost never referenced in an actual emergency. Moreover, it is not a script for disaster recovery, because disasters have a way of being a lot messier than you anticipated when writing the scenario on which your plan is based.
Why go to the trouble of doing a paper plan (or an electronic one) at all? It's a good question. The answer is that your plan document is intended less as a script for recovery than as a training guide and testing manual.
After developing the disaster recovery capability, carefully document what needs to happen in an emergency, then make the people who will likely play a role in recovery aware of what their roles will entail. There are no solid guidelines on how many teams you will need or how many people need to be part of those teams. Common sense is your best guidance. A good strategy is to ask business unit or departmental managers who they believe would need to be involved in a recovery effort, then select that person and a backup. IT will clearly need to have a significant presence.
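The "select that person and a backup" rule is easy to check mechanically once the roster is written down. The following is a minimal sketch, not a prescribed format: the `TeamSlot` structure, the `roster_gaps` helper and the sample names are all hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TeamSlot:
    """One recovery role nominated by a business unit manager (hypothetical structure)."""
    business_unit: str
    primary: Optional[str] = None  # person the manager named
    backup: Optional[str] = None   # alternate if the primary is unavailable


def roster_gaps(roster: List[TeamSlot]) -> List[str]:
    """Return a human-readable list of slots that lack a primary or a distinct backup."""
    gaps = []
    for slot in roster:
        if not slot.primary:
            gaps.append(f"{slot.business_unit}: no primary contact")
        if not slot.backup:
            gaps.append(f"{slot.business_unit}: no backup contact")
        elif slot.backup == slot.primary:
            gaps.append(f"{slot.business_unit}: backup is the same person as primary")
    return gaps


# Sample data, invented for illustration only.
roster = [
    TeamSlot("Finance", primary="A. Rivera", backup="B. Chen"),
    TeamSlot("IT", primary="C. Okafor"),
]

for problem in roster_gaps(roster):
    print(problem)
```

Running a check like this whenever the roster changes keeps a missing backup from going unnoticed until a test or an actual emergency.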
Training consists of presenting strategies and procedures in a classroom environment. Train teams in what they will need to do in an emergency, but also provide sufficient context so they can understand how their work will impact or be impacted by the work of other teams. Take their feedback to heart, as they might identify requirements that you have overlooked.
Teams are also involved in testing, which is as much a training exercise as it is a rehearsal of plan activation. There are many ways to test, and entire books have been written about the subject by DR planning consultants. I prefer non-disruptive testing, that is, tests that do not interfere with actual day-to-day operations. Testing by unplugging a server from power or from its storage is almost certain to bring about a career disaster.
One test strategy is a walkthrough. Teams are brought into a common environment and a scenario is set forth. Then teams identify what they will need to do and speculate about the possible obstacles they may face and what may be needed to surmount them in terms of resources or time. Document the entire process.
Another strategy is to utilize the testing time you are granted if you subscribe to a recovery facility or hot site. Try to recover your data, systems and network operations, and confirm that the necessary resources are available and that the allotted time is sufficient. Document the outcome of this test as well. It provides direct feedback on the continuing efficacy of the plan and shows where the plan has fallen out of synchronization with the changes in business processes and IT infrastructure that occur on an almost daily basis.
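One simple way to document a hot site test is to record, for each recovery step, the time the plan allots and the time the step actually took. The sketch below assumes that approach; the `RecoveryStep` structure, the step names and all of the minute figures are hypothetical and are not drawn from any real test.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RecoveryStep:
    """One step timed during a hot site test (hypothetical structure)."""
    name: str
    allotted_minutes: int  # time budgeted for the step in the plan
    actual_minutes: int    # time measured during the test


def overruns(steps: List[RecoveryStep]) -> List[Tuple[str, int]]:
    """Return (step name, minutes over) for every step that exceeded its allotment."""
    return [
        (s.name, s.actual_minutes - s.allotted_minutes)
        for s in steps
        if s.actual_minutes > s.allotted_minutes
    ]


# Invented figures, for illustration only.
test_log = [
    RecoveryStep("Restore data from backup copies", 240, 310),
    RecoveryStep("Bring up application servers", 120, 95),
    RecoveryStep("Re-establish network links", 60, 60),
]

total_allotted = sum(s.allotted_minutes for s in test_log)
total_actual = sum(s.actual_minutes for s in test_log)

print(f"Allotted: {total_allotted} min, actual: {total_actual} min")
for name, minutes_over in overruns(test_log):
    print(f"Over allotment by {minutes_over} min: {name}")
```

A step that runs over its allotment is not a failed test. It is a finding that points to where the plan or the resources behind it need revision.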
Keep in mind that there is no pride of authorship in disaster recovery planning, just as there are no failed tests. Planners must keep the plan up to date, and that activity involves both a testing regimen and a change management system. The latter may be as simple as a system of email notifications sent to the planner when team members leave the company or change jobs, when new business units are formed or existing business units change procedures, and of course when IT changes infrastructure.
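A notification scheme of this kind can be sketched in a few lines. The example below is an illustration under stated assumptions, not a recommended system: the change types, the plan section names in `REVIEW_MAP`, the `build_notice` helper and the example.com addresses are all hypothetical. It only builds the message with Python's standard `email.message.EmailMessage` class and does not send anything.

```python
from email.message import EmailMessage
from typing import List

# Hypothetical mapping from the kinds of change named above to the
# plan sections the planner would want to review.
REVIEW_MAP = {
    "staff_departure": ["team rosters", "notification call list"],
    "job_change": ["team rosters"],
    "new_business_unit": ["recovery requirements", "team rosters"],
    "procedure_change": ["recovery procedures", "training material"],
    "infrastructure_change": ["recovery procedures", "hot site configuration"],
}


def sections_to_review(change_type: str) -> List[str]:
    """Look up affected plan sections; unknown change types trigger a full review."""
    return REVIEW_MAP.get(change_type, ["entire plan"])


def build_notice(change_type: str, detail: str, sender: str) -> EmailMessage:
    """Compose (but do not send) a change notification addressed to the DR planner."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = "dr-planner@example.com"  # placeholder address
    msg["Subject"] = f"DR plan change notice: {change_type}"
    body = f"{detail}\n\nPlan sections to review: " + ", ".join(
        sections_to_review(change_type)
    )
    msg.set_content(body)
    return msg


notice = build_notice(
    "infrastructure_change",
    "Storage array in the primary data center replaced.",
    sender="it-ops@example.com",
)
print(notice["Subject"])
print(notice.get_content())
```

Defaulting unknown change types to a review of the entire plan is a deliberately conservative choice: it is safer for the planner to dismiss an unnecessary notice than to miss a change that was never categorized.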
Testing, training and change management are where the rubber meets the road in DR planning. In addition to all of the value of these activities described above, they also provide a mental rehearsal that will aid personnel in responding rationally in the face of the great irrationality that is a disaster. As Shakespeare once wrote, "All things are ready if our minds be so."
In the final installment of this mini-tutorial, we'll look at the role of suppliers, vendors and integrators in the planning process.
About the Author: Jon William Toigo has authored hundreds of articles on storage and technology along with his monthly SearchStorage.com "Toigo's Take on Storage" expert column and backup/recovery feature. He is also a frequent site contributor on the subjects of storage management, disaster recovery and enterprise storage. Toigo has authored a number of storage books, including Disaster recovery planning: Preparing for the unthinkable, 3/e. For detailed information on the nine parts of a full-fledged DR plan, see Jon's web site at www.drplanning.org/phases.html.