Editor's note: When preparing your IT group to avoid the effects of a potential disaster, one critical component you can't overlook is the need to create and staff recovery teams. While this task encompasses several IT factions, the authors maintain it goes well beyond that. This excerpt shares more about the 12 types of teams you should identify and staff ahead of time, as part of efforts to develop an overall disaster recovery plan.
Recovery teams should include the people who know their areas best, chosen without regard for ego and with an eye toward an appropriately speedy recovery from the disaster. Your organization may need more recovery teams than we suggest below, or you may find that some of the teams we have identified are not appropriate.
At a minimum, team leaders for each appropriate group must be identified, and their management as well as the individuals themselves must agree to the selections. Once the team leaders are in place, then the rest of the teams must be staffed; the responsibility for team staffing usually falls to the team leaders. It is inappropriate for any one individual to be an active member of more than one recovery team.
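The staffing rules above (every team needs an agreed leader, and nobody should be active on more than one team) lend themselves to a quick automated check. What follows is only a minimal sketch, assuming a hypothetical roster kept as a Python dictionary; the `roster` layout, the `check_roster` helper, and all of the names in it are invented for illustration and do not come from the book.

```python
from collections import defaultdict

# Hypothetical roster: team name -> {"leader": name or None, "members": [names]}.
# Every team and person name here is invented for illustration.
roster = {
    "Network Recovery": {"leader": "Ana", "members": ["Ana", "Bo"]},
    "Storage Recovery": {"leader": "Cy", "members": ["Cy", "Bo"]},
    "Logistics": {"leader": None, "members": ["Dee"]},
}

def check_roster(roster):
    """Return (teams without a leader, people assigned to more than one team)."""
    teams_by_person = defaultdict(set)
    missing_leader = []
    for team, info in roster.items():
        leader = info.get("leader")
        if not leader:
            missing_leader.append(team)
        # Count the leader as a member of the team they lead.
        people = set(info.get("members", []))
        if leader:
            people.add(leader)
        for person in people:
            teams_by_person[person].add(team)
    overlaps = {p: sorted(t) for p, t in teams_by_person.items() if len(t) > 1}
    return sorted(missing_leader), overlaps

missing, overlaps = check_roster(roster)
print("Teams without a leader:", missing)
print("People on more than one team:", overlaps)
```

With the sample data, the check flags the leaderless Logistics team and the one person who appears on two teams. A real roster would more likely live in an HR system or a spreadsheet export than in a hard-coded dictionary.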
The following are the 12 recovery teams that we have identified:
- Disaster Management Team. This group oversees and coordinates the entire DR effort. The recovery coordinator will likely head this team, which should be staffed by representatives from all key business and operations areas around the company. During a recovery, all of the other recovery teams will report in to this team on a regular basis.
- Communications Team. This group includes representatives from the public relations group, if one exists, and is responsible for all communications with executive and upper management, the press, FEMA, the police, the employees and their families, and any other organizations that wish to speak with yours. Any communication requests that come in to other parts of the company must be redirected to this recovery team.
- Computer Recovery Teams. These teams work together and separately to restore all of the computer-related functions in an organization after a disaster. An organization may have more or fewer teams, combining some and eliminating others. These teams will work very closely together throughout the recovery to make sure that, in the end, all systems, networks, storage, applications, and data are restored.
- System Recovery Team. This team will be broken into several subteams, one for each system platform found across the organization. A typical modern organization might have a Windows team, a Solaris team, an HP-UX team, and a mainframe team. There should be a single leader to whom all system recovery teams report. This team's goal is to get all critical systems back online after a disaster.
- Network Recovery Team. This group is responsible for getting the LANs and WANs and all of the associated equipment (routers, switches, hubs, etc.) back online at the DR site.
- Storage Recovery Team. These people are responsible for restoring the SAN to service (if one exists), and for making sure that all disk and tape drives are accessible by their systems after the system recovery team has completed its job.
- Applications Recovery Team. Once the system and storage teams have completed their tasks, the application team takes over and works to get databases and other applications back online.
- Data Management Team. This group is responsible for getting data restored to the systems where it is needed, and for physically managing backup tapes. They are responsible for making sure that the tapes make it to the DR site after the disaster.
- Vendor Contact Team. This team is responsible for contacting all of the enterprise's vendors and maintaining that contact as necessary throughout the recovery period. All requests for vendor contacts from any of the computer recovery teams, or any other recovery team, should be funneled through the vendor contact team.
- Damage Assessment and Salvage Team. Depending on the nature of the disaster, this team might spend its time back at the site of the disaster, trying to determine the state of the original site, or trying to salvage any equipment or data that might be salvageable. This group would also be responsible for signage, and for roping off areas of the facility that may be dangerous or inaccessible.
- Business Interface Team. Lest we forget about the business (or primary function of the enterprise), a team must be designated that answers to the business itself and makes sure that its needs are being met and its priorities are being addressed by the recovery efforts. If the business is working to bring back a particular group first, then it is important that the business interface team be made aware of it so that the efforts in IT match the business's efforts. At the same time, this team makes sure that the business folks don't interfere with IT's efforts. The latter point is especially true when the business people are former engineers who think they still know everything.
- Logistics Team. All of the details that are part of the recovery effort are handled by the logistics team. They make sure that there is food, water, space to work, and telephone service, and they handle all of the other seemingly unimportant tasks that would cause the entire recovery to fail if they were overlooked.
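The list above states a few explicit hand-offs among the computer recovery teams: the storage team finishes its work after the system recovery team, and the applications team takes over once both are done. One way to make that ordering explicit in a plan is to record the dependencies and derive a sequence from them. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later); the `depends_on` dictionary is an assumption made for illustration and encodes only the hand-offs stated in the text, not the other teams.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each team maps to the set of teams that must finish before it can start.
# Only the hand-offs stated in the team descriptions are encoded here.
depends_on = {
    "System Recovery": set(),
    "Storage Recovery": {"System Recovery"},
    "Applications Recovery": {"System Recovery", "Storage Recovery"},
}

# static_order() yields the teams so that every dependency comes first.
order = list(TopologicalSorter(depends_on).static_order())
print(order)
```

In practice, many of the teams work in parallel, so a sequence like this is best read as a set of minimum hand-off constraints rather than a strict schedule.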
For more information:
Tip: DR planning phase one: Preliminary risk analysis
Tip: DR planning phase two: Critical design tasks
Tip: No one-size-fits-all data protection solution
Content in this tip has been excerpted by permission from the book, "Blueprints for high availability, Second edition," authored by Evan Marcus and Hal Stern, Wiley Publishing, Inc. All rights reserved.
About the authors: Evan Marcus is a frequent SearchStorage.com contributor and an expert at answering readers' questions related to availability, backup and disaster recovery-related issues. He is a principal engineer for Veritas Software and the industry's data availability maven, with over 12 years of experience in this area, and a frequent speaker at industry technical conferences.
Hal Stern is the vice president and chief technology officer for the Services business unit of Sun Microsystems. He has worked on reliability and availability issues for some of the largest online trading and sports information sites as well as several network service providers.
This was first published in March 2004.