Ever since 9/11, we’ve seen top management and government regulators place increasing emphasis on requiring disaster planners to demonstrate that their plans will actually work.
For an organization with even a limited amount of complexity, this “show me” requirement can seem overwhelming in terms of cost, disruption, and time expended. However, by dividing the task into four levels of increasing difficulty, it is possible to meet that requirement while minimizing disruption.
Basically, there are four kinds of tests available for a contingency or disaster recovery plan: (1) the blink test; (2) audit assessments and structured walk-throughs by “independent experts”; (3) component tests; and (4) “pull-the-plug” exercises.
Blink Test
The blink test is often linked with the disaster recovery training cycle. Each task within the plan is first assigned to a specific employee, who is then asked to sign a statement saying that they have read and understood the plan as it applies to them and that they are able to carry out their assigned role within the plan, noting any limitations they may have in doing so.
At that point, we’ve discovered that people begin to speak up and say, “Wait a minute. I’m not authorized for that,” or “I don’t retain that information,” or even “I have family commitments that preclude that.” The response has to be, “What do we change in the plan to get you to sign the statement?”
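One way to keep track of this sign-off cycle is a small record per task. The following is only a sketch under stated assumptions: the `PlanTask` structure, its field names, and the sample tasks and names are hypothetical illustrations, not part of any particular planning tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanTask:
    """One plan task and its assigned employee (hypothetical structure)."""
    name: str
    assignee: str
    signed: bool = False
    # Objections raised by the assignee, e.g. "not authorized for that".
    limitations: List[str] = field(default_factory=list)


def unsigned_tasks(tasks):
    """Return tasks whose assignee has not yet signed the statement."""
    return [t for t in tasks if not t.signed]


def tasks_needing_plan_changes(tasks):
    """Return tasks where the assignee noted at least one limitation."""
    return [t for t in tasks if t.limitations]


# Illustrative data only; the tasks and names are made up.
tasks = [
    PlanTask("Notify suppliers after hours", "A. Rivera", signed=True),
    PlanTask("Authorize emergency purchases", "B. Chen",
             limitations=["not authorized for that"]),
    PlanTask("Start emergency generator", "C. Okafor"),
]

for task in tasks_needing_plan_changes(tasks):
    print(f"Revise plan: {task.name} ({task.assignee}): "
          f"{'; '.join(task.limitations)}")
```

A task that stays on the unsigned list, or that carries a limitation, is a prompt to change the plan rather than to pressure the employee.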
Audit Assessments / Structured Walk-Throughs
The plan is next reviewed by both internal and external experts. Internal experts can include personnel from outside departments who are familiar with how the areas under evaluation operate. These employees are asked to walk through the various scenarios covered by the plan and to provide independent comments, based on their expertise and familiarity with the daily ebb and flow of the specific operations.
To obtain reviews from external experts, representatives of the planning team should participate in region-wide contingency management organizations such as NEDRIX (New England Disaster Recovery Information Exchange), the Business Recovery Managers Association in California, the Business Continuity Planners Association in the Midwest, or others, and, to the extent possible, present their plan components there.
Component Tests
Disaster recovery plans involve many components that can be tested independently. Of course, when the disaster strikes, these components must all work together, but if independent components can be shown to work by themselves, they can be counted on to do their part when the crisis occurs.
Among the specific components that can be independently tested are the recovery and re-installation of backup files; after-hours emergency notification of employees and suppliers; emergency generator operation; and building evacuation procedures.
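As an illustration of the first item, a backup restore test can be checked mechanically by comparing each original file with its restored copy. This is a minimal sketch, assuming the restore writes files into a directory that mirrors the source layout; the function names `sha256_of` and `verify_restore` are hypothetical, and real backup products usually supply their own verification tools.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir, restored_dir):
    """Compare every file under source_dir with its restored copy.

    Returns a sorted list of relative paths that are missing from
    restored_dir or whose contents differ from the original.
    """
    source_dir, restored_dir = Path(source_dir), Path(restored_dir)
    problems = []
    for original in source_dir.rglob("*"):
        if not original.is_file():
            continue
        relative = original.relative_to(source_dir)
        restored = restored_dir / relative
        if not restored.is_file() or sha256_of(original) != sha256_of(restored):
            problems.append(relative.as_posix())
    return sorted(problems)
```

An empty list from `verify_restore` means every source file came back intact; anything else is a finding to record and fix before relying on that backup in a real crisis.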
Component testing also includes simulating a disaster at a single site for organizations that have many locations. By taking one small office offline or relocating it temporarily to its backup site, the organization can flush out many problem areas in the transition from normal to crisis mode and back to normal again. Later on, the plan can be further tested at larger offices.
Pull-the-Plug Exercises
Finally, it is necessary to resolve the question of whether or not all the various plan components can actually work together when they have to. This basically requires a “pull-the-plug” test, in which the entire organization is taken down and then re-opened and operated at alternate sites.
For most organizations, this is simply too disruptive to actually carry out. However, real life often intervenes to make it happen anyway. In those cases, when a mini-disaster happens, planners need to document the events in detail, as if they were a test, so that afterwards they can assess the following issues:
- exactly what happened to cause the crisis;
- what damages occurred as the crisis unfolded, following the causing event;
- what had been the planned responses to the situation;
- what actions were actually taken by the personnel affected and the responding personnel;
- with the benefit of hindsight, what should have been the responses of the personnel affected and the responding personnel;
- what was learned for the future – what worked and what didn’t;
- how the disaster plan should be modified;
- how the disaster plan modifications should be communicated to all personnel.
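The checklist above can be captured as a simple record so that no question is skipped when an incident is written up. This is only a sketch: the `IncidentReview` class and its field names are hypothetical labels for the eight questions, and the sample entries are invented for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class IncidentReview:
    """One field per question in the post-incident checklist."""
    cause: str = ""
    damages: str = ""
    planned_responses: str = ""
    actual_actions: str = ""
    hindsight_responses: str = ""
    lessons_learned: str = ""
    plan_modifications: str = ""
    communication_of_changes: str = ""

    def open_items(self):
        """Return the names of questions that are still unanswered."""
        return [f.name for f in fields(self)
                if not getattr(self, f.name).strip()]


# Illustrative entries only; the incident described is made up.
review = IncidentReview(
    cause="Water leak above the server room",
    actual_actions="Staff powered down two racks and called facilities",
)

for item in review.open_items():
    print(f"Still to document: {item}")
```

Treating an unanswered field as an open item keeps the write-up from being closed before the lessons and plan changes have actually been recorded.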
It is crucial to remember that this testing process is always a work in progress. It needs to be repeated on a regular, ongoing basis, with continual documentation and feedback to all involved.
Steven Lewis, Ph.D., is the Editor-in-Chief of Edwards Disaster Recovery Directory. He is a Certified Information Systems Auditor (CISA) with a Ph.D. in Systems from the University of Pennsylvania and master’s and bachelor’s degrees in Engineering from Cornell University.
During the last 15 years, he has developed over 120 comprehensive disaster recovery/business continuity plans for network-based organizations. All of these plans were subject to review by regulatory agencies, and all were approved. Many also included “Year 2000” risk analyses, evaluation, and testing.
Dr. Lewis has also authored numerous articles, including “Plan for a Disaster Without Destroying Your Budget,” which appeared in Public Risk magazine, and “Disaster Recovery Planning: A HIPAA Requirement,” which appeared in Health Facilities Management magazine. He can be reached at [email protected]
This article was first published on EnterpriseITPlanet.com.