Effective Date: July 03, 2003
Expiration Date: August 03, 2023
4.1.1. A Continuity of Operations Plan will be developed to assist an organization in preventing and responding to events that might disrupt mission-essential operations and services, where possible, and to minimize the potential impact of any unavoidable disruption.
a. A Continuity of Operations Plan recognizes the possibility that individuals may execute resumption and recovery operations with limited prior exposure to or knowledge of the entire plan in detail.
b. The Continuity of Operations Plan's development should focus on the following issues:
(1) Heightened awareness of management and employees.
(2) Resumption of essential operations.
(3) Advance preparation to minimize impact potential.
(4) Training in the execution of predefined and preassigned responsibilities and tasks.
(5) Incorporation, when appropriate, of existing contingency plans for IT resources as required under OMB Circular A-130.
4.1.2. The Continuity of Operations Plan will include the strategies, actions, and procedures established to resume mission-essential operations.
4.1.3. The Continuity of Operations Plan should also contain a statement of management policy. This statement identifies the plan's objectives, its scope and limitations, the assumptions made during its development, and guidelines for administering the plan's contents.
4.1.4. A sample Continuity of Operations Plan format is provided in Appendix C.
4.2.1. The purpose of a Continuity of Operations Plan is to assist the affected organization in resuming mission-critical, time-sensitive business operations and services, its technology, and its support operations in a timely and organized manner in order to continue as a viable and stable entity.
4.2.2. The primary objectives of a Continuity of Operations Plan are to--
a. Provide a tested vehicle which, when executed, will permit and support a safe, efficient, timely resumption of the interrupted essential operation(s).
b. Identify those mission-essential operations that, by nature of their criticality, cannot be disrupted under any circumstance.
c. Ensure the continuity of the essential operations or services provided from the affected facility and organization.
d. Minimize inconvenience and potential disruption to other operations.
e. Minimize the impact on NASA's public image and the adverse financial effects of an event.
f. Resume technology operations and communications support for mission-critical and time-sensitive NASA essential operations in the event existing operations have been rendered inoperable.
g. Reduce operational effects of a disaster on NASA mission-critical and/or time-sensitive essential operations through a set of predefined and flexible procedures to be used in directing recovery operations.
h. Resume production processing of the most time-sensitive IT systems, network services, communications, and applications within a defined timeframe (e.g., immediate, 8 hours, 24 hours) following the disruptive event.
i. Resume production processing of less time-sensitive IT systems, network services, communications, and applications within a defined timeframe (e.g., 5 to 30 calendar days) following the disruptive event.
j. Resume full operational capability, including test and development work, for technology and operations within a defined timeframe (e.g., 30 to 45 calendar days) following the event, as permitted by the restoration effort.
k. Resume and maintain adequate service levels to supported organizations.
l. Provide a proper work environment for displaced staff while the facility and its contents are being restored.
m. Ensure that normal operations are restored in a timely manner.
n. Provide the organization with a viable, well-maintained plan.
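As a rough illustration of how the timeframes in objectives h through j might be recorded in an individual plan, the following Python sketch assigns a system to a resumption tier based on its tolerable outage period. The tier names, the function name, and the boundary values are assumptions made for this example only; the timeframes above are given as examples, and each organization would substitute the values established in its own plan.

```python
from datetime import timedelta

# Hypothetical tier boundaries, borrowed from the example timeframes in
# objectives h, i, and j. An actual plan would substitute its own values.
TIERS = [
    ("most time-sensitive", timedelta(hours=24)),
    ("less time-sensitive", timedelta(days=30)),
    ("full operational capability", timedelta(days=45)),
]


def resumption_tier(tolerable_outage: timedelta) -> str:
    """Return the first tier whose upper bound covers the tolerable outage."""
    for name, upper_bound in TIERS:
        if tolerable_outage <= upper_bound:
            return name
    # Anything beyond the last boundary is treated as full-capability work.
    return TIERS[-1][0]
```

A system that can tolerate an 8-hour outage would fall into the first tier under these assumed boundaries, while one that can tolerate 5 days would fall into the second.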
4.2.3. A Continuity of Operations Plan also seeks to minimize the following:
a. The number and frequency of "ad hoc" decisions that will be made following a disaster.
b. An individual organization's dependence on the participation of any specific person or group of persons.
c. The need to develop and implement new procedures once the disaster has occurred.
d. The loss of vital data and information, recognizing that some loss is inevitable.
e. Confusion and exposure to errors, omissions, and unnecessary duplication of effort.
f. The total elapsed time to execute response, recovery, and restoration processes.
4.3.1. The scope of the Continuity of Operations Plan will include mission-essential, time-sensitive, and less time-sensitive operations, supporting IT, and other supporting infrastructure.
4.3.2. The Continuity of Operations Plan will be activated in the event that an essential function, or a portion of it, is involved in an emergency, or is declared unable to be performed in its primary location with primary support infrastructure (e.g., staff, data and communications systems, utilities, facility, furniture).
4.3.3. The Continuity of Operations Plan will address resumption and recovery of essential operations in a disaster situation. It should not separately address building emergency and evacuation procedures or onsite resumption and recovery procedures, but should incorporate or reference existing emergency-preparedness documents.
4.3.4. Actions related to the physical restoration process, in terms of primary site restoration, recovery deactivation, migration and reestablishment of normal operations, termination and shutdown of recovery operations at alternate sites, and postrecovery operations, will be addressed in individual continuity team tasks.
4.3.5. The Continuity of Operations Plan will be based on NASA Center management knowledge, review, and approval of those mission-critical essential operations, applications, and associated support operations identified as time- and/or mission-sensitive.
a. The time sensitivity of the essential operations and support activity performed by the organization will be documented during the preplanning process known as a business plan analysis outlined in chapter 3, paragraph 3.3.1.
b. The business plan analysis identifies the time-sensitive, mission-essential operations, IT requirements, time-sensitive support operations, and tolerable outage periods, where appropriate, for which and after which disruptions could result in significant losses to NASA.
c. The resulting application of recovery priorities, on which the Continuity of Operations Plan is based, will be documented in a report, "Essential Processes by Criticality," to be included in the Continuity of Operations Plan appendices.
4.4.1. NASA activities take place at different locations, and consequently, potential emergencies may be varied.
4.4.2. Site plans for NASA Centers and details on the types of emergencies that each NASA Center could expect to face are contained in the individual Center Emergency Preparedness Program Plans.
4.5.1. A NASA mission-essential infrastructure asset (e.g., function, facility(ies), system), or other essential resource(s), is totally unusable or inaccessible, and there is no salvageable equipment, data, or documentation.
4.5.2. The Continuity of Operations Plan is designed for "worst-case" scenarios and depends, to a large degree, on the ability to resume operations from less serious interruptions through the activation of contingency plans established under Agency emergency-preparedness documents or those developed under OMB Circular A-130 for IT resources.
4.5.3. In circumstances involving a localized event (i.e., limited to a single facility and system), equipment vendors and local utility companies should normally be able to install replacement IT and communications hardware and telephone circuits in 1 to 5 calendar days. This assumes that replacement service and equipment orders are placed on an "emergency" basis at the time of the event. It also assumes that the individual NASA Center or facility can obtain and prepare suitable alternate site(s), within 3 to 5 days, to serve as an interim resumption and recovery location for its business operations and information processing centers.
4.5.4. In the event of a regional emergency, such as an earthquake or a tornado, it could take weeks to acquire the necessary equipment and data circuits, because multiple organizations will be contending for the same emergency resources and services. Regional emergencies that cause widespread disruption of public utilities such as electricity, water, and network services may also cause additional delays in reestablishing NASA business and technology operations without preidentified, preconditioned, and contractual alternate backup sites.
4.5.5. NASA Centers and organizations will have access to and use of sufficient physical sites within the NASA environment to meet their resumption and recovery time objectives. Sites currently considered eligible temporary recovery locations are listed in the individual Continuity of Operations Plan as an appendix. The repositioning of redundant equipment and operational capability, environmental conditioning, and access to the NASA WAN, which may be necessary to accomplish recovery actions, are also addressed in the Continuity of Operations Plan as an appendix.
4.5.6. The level of documentation in the individual Continuity of Operations Plan assumes and requires that NASA management and staff are familiar with the Center's or organization's business operations, its supporting resources configuration, and the requirements of the Continuity of Operations Plan.
4.5.7. Sufficient management and staff, familiar with and trained in the procedures and tasks in the individual Continuity of Operations Plan, will be available subsequent to the interrupting event to execute their recovery responsibilities and to support the restoration effort. NASA personnel understand that, following a major interruption of essential operations, it will not be a matter of "business as usual" but "survival."
4.5.8. All vital business documentation and files necessary for resumption and recovery purposes are backed up and stored safely away from the critical facility(ies) using a rotation schedule that minimizes data loss.
4.5.9. All vital electronic data files required to implement resumption of the current operating environments, and/or that support time-sensitive essential operations are backed up daily.
a. When appropriate, this information is rotated to a safe offsite location according to a schedule that minimizes data loss and the effort to reconstruct production environments.
b. The type of backups and the timing of the offsite rotation and retention are approved by NASA management and are considered sufficient to minimize the reentry and reconstruction of data and the re-creation and forward recovery of files to current status.
4.5.10. All vital backup items for resumption and recovery are stored onsite and offsite or can be easily and quickly obtained or created from other identified sources.
a. The backups stored onsite are in a series of fire resistant safes that are located within the Center and organization boundaries.
b. The backups stored offsite are in a secured location that is sufficiently distant from the primary site so they would be unaffected by most interrupting events.
c. These stored backups are considered to be the only resources available to implement resumption.
d. The Continuity of Operations Plan assumes that locations where backups are stored were not affected by the emergency incident or situation and can be accessed by NASA personnel.
4.5.11. All information necessary to complete the internal and external contacts quickly and accurately during resumption is documented and maintained in the Continuity of Operations Plan.
4.5.12. The timeframe within which each time-sensitive essential function, supporting activity, and IT system is to be resumed has been set, is current with the needs of clients, and is available within the Continuity of Operations Plan.
a. The resumption of each essential function is greatly dependent on the availability of appropriate staff, its information, communications, and its access to the IT systems and data files, if required.
b. Actual timeframes for resumption and recovery may be influenced by the availability of staff, alternate operating sites, hardware and software, current backup files, and the reload time requirements of the IT system architectures.
4.6.1. Loss of functionality of essential operations (e.g., facilities, systems, other interdependencies) at a NASA location may have a significant impact on data delivery at the Center and organizational level, and in some instances, throughout NASA.
4.6.2. The Continuity of Operations Plan will be developed to respond effectively to a significant event by using a predefined method for utilizing various facility, staff, and technical resources. This method, known as the recovery strategy, is employed to help ensure that an affected organization can accomplish the resumption and recovery of mission-essential operations within stated timeframes at required levels of service.
4.6.3. Consideration will be given to selecting a recovery strategy that is workable as well as cost efficient.
4.6.4. A COOP recovery strategy should anticipate the availability of other NASA and Federal agency locations for use as alternate operation sites. (A prioritized list of eligible locations should be provided as an appendix to the Continuity of Operations Plan.)
4.6.5. Alternate sites should be selected for their ability to support physical and technical infrastructure requirements while providing the best possible access to essential communications resources (e.g., telephones, WAN/LAN), as necessary to meet essential requirements.
4.6.6. As appropriate, the Continuity of Operations Plan will ensure planning for continuity teams including relocating to selected alternate sites and preparing them for use as alternate operating sites should an event require activation of the Continuity of Operations Plan.
4.6.7. Where the Continuity of Operations Plan provides for prepositioned equipment and services, COOP teams will activate those resources as necessary.
4.6.8. Where additional equipment and other services are needed to upgrade a site to full utilization, these items will be acquired and installed on an emergency basis.
4.6.9. Configuration details for current servers and network management devices are included in the Continuity of Operations Plan to help expedite this "acquire time of disaster" strategy, as is the current inventory of NASA contracts for which emergency requisitions will be drafted (included in the Continuity of Operations Plan appendices).
4.7.1. The scope of administrative duties and responsibilities includes, but is not limited to, the continued endorsement of the Continuity of Operations Plan by affected program management, through mandatory, documented review of the Continuity of Operations Plan by appropriate management personnel and team members, on no less than an annual basis.
4.7.2. A report on the plan's administration, prepared by the responsible program or project management, will be reviewed and approved by the Agency Senior Management official responsible for the COOP program, annually or as otherwise required.
4.7.3. The affected Program Manager, or his/her designee, is responsible for administration of the plan.
a. He/she will ensure that NASA standards and procedures are developed to address COOP administrative needs.
b. He/she will also include any relevant, related documentation in the plan.
c. As custodian and administrator of the Continuity of Operations Plan, he/she will have a thorough knowledge of all plan contents.
d. As a further safeguard, he/she should never be the sole person in the organization with extensive knowledge of the structure and contents of the plan. An alternate COOP coordinator will be a full participant in all plan maintenance and exercise activities.
4.7.4. Responsibility for maintaining specific sections of the Continuity of Operations Plan resides with each COOP Team Leader in accordance with the Team's objectives and functional responsibilities for Prevention, Response, Resumption, Recovery, and Restoration.
a. Team leaders should ensure compliance with these documented procedures for plan administration.
b. Each employee, regardless of their role as a COOP team member, is responsible for providing updated personal contact information to the responsible Program or Project Manager, as changes occur.
4.7.5. Each employee is responsible for the maintenance of the affected organization's capability to respond and resume essential operations following a disaster.
a. Some individuals will have more direct responsibility than others will.
b. Each individual should be aware of the necessity for the preservation of such a continuity capability and should ensure that the Prevention, Response, Resumption, Recovery, or Restoration capability is truly viable.
c. Should a plan review necessitate changes or updates, the COOP Coordinator is responsible for implementing the changes and issuing updated plan documentation.
d. Individuals in responsible management positions will be called upon periodically to provide information necessary for maintaining a viable plan and an exercised continuity capability.
4.8.1. This section outlines the four major stages of the continuity process as it applies to development and maintenance of a Continuity of Operations Plan.
4.8.2. It describes the central activities and objectives of each stage and the relationships between stages.
4.8.3. Actual circumstances of the business interruption disaster will determine whether a particular stage is initiated and how long it will take to complete.
4.8.4. This section provides guidelines and explains the continuity process.
4.8.5. Emergency Response
Following the notification of the emergency incident or situation, and in accordance with Agency and individual Center Emergency Preparedness Response Plans, a team of key COOP personnel, the COOP Assessment team, will first assemble at the incident site, or other staging area if the incident site is deemed unsafe, contaminated, or otherwise unsuitable for use, and begin to assess and evaluate the site.
a. Primary objectives of the COOP Assessment Team are:
(1) To operate safely and efficiently.
(2) To establish an immediate and controlled presence at the incident site, as allowable per instructions of the onscene incident commander in accordance with Center Emergency Preparedness Response Plans.
(3) To conduct an onsite (and in some instances "standoff") assessment of the incident impact, known injuries, extent of damage, and disruption to the facility(ies), services, and business operations.
(4) To determine if and/or when access to the facility(ies) will be allowed.
(5) To provide the appropriate management team with the facts necessary to make informed decisions regarding subsequent recovery activity.
b. Response to an emergency does not necessarily or automatically translate into the declaration of a disaster and the implementation of a COOP.
c. Activation of the disaster recovery portion of the Continuity of Operations Plan requires significant expenditures of time, personnel, and financial resources. The appropriate affected program management team will determine whether or not the expenditure of resources is warranted and to what extent it is justified, based on the information and recommendations provided by the Assessment Team.
4.8.6. The appendices of the Continuity of Operations Plan will contain up-to-date contact lists, team assignments, checklists of specific tasks to be performed, and copies of any individual contingency plans developed under OMB Circular A-130.
4.9.1. Initial notification of an incident or situation is generally expected to come directly from an affected organization staff member or the discoverer of the event. Other potential sources of incident notification might be law enforcement, security, fire department, other Center emergency response personnel, and news media.
4.9.2. The Continuity of Operations Plan will provide instructions for the proper and timely notification of an emergency situation, including notification to Center management, Center Emergency Response management personnel, and appropriate Headquarters personnel per requirements of NPD 8710.1, Emergency Preparedness Program, and NPR 8715.2, NASA Emergency Preparedness Plan Procedural Requirements.
4.9.3. The Continuity of Operations Plan will provide specific instructions and guidelines for contacting members of the continuity management team and the various response teams. Notification procedures will also include requirements for maintaining records of all notifications made.
4.9.4. Notification Guidelines
4.9.4.1 The individual Continuity of Operations Plan will include instructions and guidance for the following:
a. All team leaders and team members should be assigned call tree responsibilities that will be followed during the emergency notification. The appropriate Center Director will determine if the facility should be declared a disaster, based on a preliminary assessment of the situation.
b. If emergency notification procedures are initiated, each team leader will be responsible for contacting their alternate team leader and team members with specific instructions.
c. If the team leader is not available, the alternate team leader will assume the team leader's responsibilities.
d. In the event the alternate team leader is also not available, the management team will assign someone to complete the notifications until the primary or alternate team leaders become available and resume their responsibilities. It is important that all key personnel be notified of the disaster as soon as possible to begin resumption of essential operations.
e. An Employee and Contractor Notification List will be developed and maintained that has the telephone numbers of essential personnel to be notified in a predetermined sequence.
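The fallback order described in items b through d, together with the record-keeping requirement in paragraph 4.9.3, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a prescribed implementation: the class names, function names, and record layout are hypothetical, and the sketch only decides who makes the calls and logs each attempt; it does not place any calls.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional, Set, Tuple


@dataclass
class Team:
    name: str
    leader: str
    alternate: str
    members: List[str]


@dataclass
class NotificationLog:
    # Each record: (time, caller, person contacted, team, whether reached).
    records: List[Tuple[datetime, str, str, str, bool]] = field(default_factory=list)

    def record(self, caller: str, contacted: str, team: str, reached: bool) -> None:
        self.records.append(
            (datetime.now(timezone.utc), caller, contacted, team, reached)
        )


def select_caller(team: Team, available: Set[str], designee: Optional[str]) -> str:
    """Choose the caller: team leader, then alternate, then a management designee."""
    if team.leader in available:
        return team.leader
    if team.alternate in available:
        return team.alternate
    if designee is None:
        raise ValueError(f"no caller available for team {team.name}")
    return designee


def notify_team(team: Team, available: Set[str], designee: Optional[str],
                log: NotificationLog) -> str:
    """Log a contact attempt for everyone on the roster except the caller."""
    caller = select_caller(team, available, designee)
    # Roster order: leader, alternate, then members in their listed sequence.
    roster = [team.leader, team.alternate, *team.members]
    for person in roster:
        if person != caller:
            log.record(caller, person, team.name, reached=person in available)
    return caller
```

When the team leader is available, the roster minus the caller is the alternate team leader followed by the team members, which matches item b. Unreached persons are still logged so that the record of notifications is complete.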
4.9.5. Objectives of the Continuity Organization During Emergency Response
4.9.5.1 The objectives for the continuity organization during emergency response are as follows:
a. Complete emergency response, notification, and mobilization duties as directed by the COOP Management Team.
b. Ensure that the COOP Management Team is contacted and apprised of the emergency situation's status and activity.
c. Obtain situation reports of personnel injury, damage, and other related matters from Center Emergency Response management personnel.
d. When permitted to do so by Center Emergency Response authorities, perform assessment(s) and evaluation(s) until the extent of impact or damage can be reasonably determined.
e. Document the results of preliminary assessment(s) and evaluation(s) and submit the report to the COOP Management Team with recommendations to terminate the emergency response activities or activate subsequent plan operations.
f. Terminate, expand, or extend the operation as directed by the COOP Management Team.
4.10.1. COOP planners will ensure that the organization's Continuity of Operations Plan contains specific guidance for the following activities:
4.10.2. Establishing and organizing a Command Center from which to manage resumption activities.
a. This Command Center may be collocated with the Center Emergency Operations Center (EOC) if activated by Center Management.
b. If the disastrous event is of a scale that impacts the entire Center, multiple COOP activities may be implemented.
c. Refer to the Center Emergency Preparedness and Response Plan for guidance.
4.10.3. Activating and mobilizing the continuity teams needed to resume time-sensitive restoration activity.
4.10.4. Evaluating alternate site equipment and network service for the necessary enhancement to support time-sensitive application recovery.
4.10.5. Mobilizing and activating the support teams needed to support enhancement and use of the alternate site(s).
4.10.6. Notifying and informing clients and NASA Senior Management of the situation.
4.10.7. Alerting employees and contractors not assigned to the continuity organization, vendors, and other key organizations to the situation and their role, if any, during resumption and recovery.
4.10.8. Once mobilized, the support teams will be instructed in their reporting and action requirements. The necessary site assessments, evaluations, and the initiation of salvage operations will be completed once the Command Center is established. Additional alerts to supporting vendors, management, and customers will also be conducted from the Command Center.
4.10.9. Based on the information and recommendations provided by the Assessment and Salvage Team, the COOP Management Team will determine whether or not the expenditure of resources is warranted, to what extent it is justified, and what other actions will be taken.
4.10.10. The objectives that will become the major focus of the resumption stage are:
a. To prepare for and/or implement the procedures necessary to facilitate and support the resumption process and subsequent restoration operations, as required.
b. To mobilize and activate the continuity teams responsible for reestablishing essential operations and functions.
c. To alert employees, vendors, and other internal and external individuals and organizations.
d. To begin implementing procedures to restore and establish time-sensitive processes and applications. This may include relocating to a temporary facility, reestablishing communications at an alternate site, or activation of a redundant site.
4.11.1. A Command Center will be established if management decides to continue and escalate the situation from emergency response to resumption operations. The site for the Command Center should be identified in advance. Initial activities performed at the Command Center are described below.
a. If the facility can be accessed, further assessments and evaluations of the onsite conditions, the damage impact and extent of the emergency incident or situation will be completed.
b. Use of the Command Center may be confined to management meetings and the cancellation of resumption operations if the facility (e.g., work areas, fixed assets, files, equipment, voice communications) is unaffected and the emergency incident or situation problems can be resolved without major impact to the critical operations.
c. If the information about the emergency incident or situation problems is inconclusive, the Command Center will be used as a meeting site until the assessments are completed.
d. If the emergency incident or situation is such that the resumption operation needs to be continued or further escalated, and/or a disaster declared, the Command Center should be organized and the appropriate support and resumption teams notified and activated as required.
4.12.1. The recovery stage of the continuity process concerns the reactivation of a greater scope of operations and services beyond the most time-sensitive operations.
a. Management, through development and implementation of the COOP, will initiate recovery-stage operations if the estimate of total outage indicates the need to expand service delivery using alternative locations and resources.
b. If, for example, the impact on the facility is expected to take more than 30 days to resolve, the recovery stage may be initiated at alternative site(s) and the appropriate resources devoted to those applications.
c. Alternatively, if it is estimated that 15 days would be needed to restore to full operations, organization management might initiate a parallel effort to resume less time-sensitive operations at another site while planning the migration of resumption activities from the alternate site(s) to the primary facility.
d. Consequently, recovery, resumption, and restoration stage activities may be conducted with some parallelism as dictated by the situation.
4.12.2. Objectives of the recovery stage operations include:
a. Maintaining a Command Center, which provides sufficient direction and support for resumption and recovery operations.
b. Mobilizing and activating additional continuity teams to facilitate the recovery of less time-sensitive operations.
c. Maintaining an adequate level of support team coverage to support all operations.
d. Maintaining an adequate level of technology team coverage to sustain information processing service demands as they grow in scope.
e. Maintaining communication with the continuity organization, clients, and Senior Management.
4.12.3. Command Center During the Recovery Stage
4.12.3.1 The level of support maintained at the Command Center during recovery will be determined by the Management Team based upon--
a. Scope of the disaster,
b. Number of essential operations affected,
c. Level of support required for the recovery of essential operations, and
d. Perception of ongoing risks and/or exposures.
4.13.1. When Center Emergency Response officials allow access to the facility, the COOP Management Team will initiate the restoration phase of the COOP.
4.13.2. The restoration stage builds on the assessments performed in the emergency response stage with the goal of returning the impacted facility to its predisaster capabilities. In circumstances where the original facility was assessed as beyond repair, this stage will involve the acquisition and outfitting of new permanent facilities.
4.13.3. The Restoration process will include the assessment of--
a. Environmental contamination of the affected areas,
b. Structural integrity of the building, and
c. Damage to furniture, fixtures, and equipment.
4.13.4. Restoration will begin when reliable estimates of contamination, structural damage, and asset loss can be obtained and personnel resources can be dedicated to the management and coordination of the process. This phase may be executed sequential to, or concurrent with, the resumption and/or recovery stages.
4.13.5. Objectives of the Restoration Stage
4.13.5.1 In addition to maintaining a Command Center that provides sufficient support for resumption and restoration operations, objectives of the restoration stage are to--
a. Maintain an adequate level of support team coverage to support all operations,
b. Maintain an adequate level of technology team coverage to sustain information processing operations, when required,
c. Maintain communication with the continuity organization,
d. Clean and/or decontaminate the facility,
e. Repair and/or restore the facility or construct or acquire a new facility,
f. Replace the contents of the facility, and
g. Coordinate the relocation and/or migration of business operations (e.g., personnel, equipment) from temporary facilities to the repaired or new facility.
4.14.1. In the event of a disaster, the normal structure of the unit should shift to that of the continuity organization.
4.14.2. The affected organization will shift from the current "business as usual" structure to an organization working towards survival and the resumption of time-sensitive essential operations.
4.14.3. The teams associated with the COOP represent units and/or support operations organized to respond, resume, recover, or restore essential operations of the affected facility. See Figure 2, chapter 3, for a representative organizational chart of a typical continuity team.
4.14.4. Each team is comprised of individuals with specific responsibilities or tasks that should be completed to fully execute the COOP.
4.14.5. Each team is led by a primary and an alternate team leader, who are responsible to the affected Program Manager or his/her designee.
4.14.6. Each team is a subunit of the continuity organization.
4.14.7. Each team is structured to provide dedicated, focused support, in the areas of its particular experience and expertise, for specific response, resumption, and recovery tasks, responsibilities, and objectives.
4.14.8. A high degree of interaction among all teams will be required to execute the COOP.
4.14.9. Each team's eventual goal is resumption, recovery, and the return to stable and normal business operations and technology environments.
4.14.10. Each team leader will report status and progress updates to the management team throughout the continuity process.
4.14.11. Close coordination will be maintained with the appropriate management personnel and each of the other teams throughout the resumption and recovery operations.
4.14.12. The primary responsibilities of the continuity organizations are to--
a. Protect employees and information assets until normal business operations are resumed.
b. Ensure that a viable capability exists to respond to an incident.
c. Manage all response, resumption, recovery, and restoration activities.
d. Support and communicate with NASA Senior Management and other locations, as necessary.
e. Accomplish rapid and efficient resumption of time-sensitive business operations.
f. Ensure that all statutory and regulatory requirements are satisfied (e.g., environmental, records retention).
g. Exercise impact resumption and recovery expenditure decisions.
h. Streamline the reporting of resumption and recovery progress among the teams and with the affected program management team and NASA Senior Management.
4.14.13. During Emergency Response, the primary responsibilities of the continuity organizations are to--
a. Establish an immediate and controlled presence at or near the incident site and await instructions from Center emergency response personnel.
b. Determine if and/or when access to the facility will be allowed.
c. Upon being granted permission to enter the impacted facility, conduct a preliminary assessment of incident impact, extent of damage, and disruption to the affected organization's services and essential operations.
d. Provide Center Senior Management with the facts necessary to make informed decisions regarding subsequent resumption and recovery activity.
4.14.14. During Resumption, the primary responsibilities of the continuity organization are to--
a. Establish and organize a Command Center for the resumption operations.
b. Notify and apprise team leaders of the situation.
c. Mobilize and activate the operations teams necessary to facilitate the resumption process.
d. Alert employees, vendors, and other internal and external individuals and organizations.
4.14.15. During Recovery, the primary responsibilities of the continuity organization are to--
a. Prepare for and/or implement procedures to facilitate and support the recovery of less time-sensitive operations.
b. Mobilize additional continuity teams and support organizations as required.
c. Maintain an information flow regarding the status of recovery operations among employees, vendors, and other internal and external individuals and organizations.
4.14.16. During Restoration, the primary responsibilities of the continuity organization are to--
a. Manage salvage, repair and/or refurbishment efforts at the affected facility.
b. Prepare procedures necessary for the relocation or migration of operations to a new or repaired facility.
c. Implement procedures necessary to mobilize operations, support, and technology relocation or migration.
d. Manage the relocation/migration effort as well as perform employee, vendor, and customer notification before, during, and after relocation or migration.
4.15.1. Activation of the COOP should only be executed when an emergency occurs that necessitates a response beyond the scope of daily standard operating procedures. In accordance with Agency and/or individual Center Emergency Preparedness Program Plans, only the following selected personnel may activate the entire plan, or any phase thereof, and/or declare a disaster situation for NASA.
a. The NASA Administrator or designee may declare a NASA emergency.
b. The Center Director or his/her designee will decide whether or not to activate the respective organizational COOP and/or declare a disaster.
4.15.2. Their decision will be based on a preliminary assessment of the business interruption incident, including any physical impairment to the facility. Pending their decision, emergency notification of NASA personnel will be initiated, and the entire COOP, or any phase thereof, will be activated, as directed.
4.15.3. Technology teams focused on restoring communications, data, and networks will be activated only as directed by the management team. Each team has its own procedures, tasks, and contact and resource information. Programs and applications will be restored according to established priorities.
4.15.4. Organization restoration teams will be activated only as directed by the Management Team based on the impact of the disruption. Restoration priorities will be established in response to the disruption. Staff will focus on reestablishing essential office operations and ensuring that the restoration teams focus on communications, application, and program recovery priorities.
4.16.1. Following the response phase of the COOP, the affected organization will organize into teams to execute its resumption and recovery activities on behalf of NASA.
4.16.2. To accomplish the tasks assigned, each team will draw upon the expertise of supporting organizations, both internal and external, as necessary.
4.16.3. This section of the COOP will identify the major groups of teams required to accomplish recovery.
4.16.4. Each team has a minimum of a leader and one or more members representing the skills appropriate to the team's role.
4.16.5. Team leaders/alternates should be thoroughly familiar with the responsibilities not only of their team but also of all the teams with which they will interact.
4.16.6. A detailed list of teams and their current team members will be located in the Plan Implementation section of the COOP.
4.16.7. The roles and responsibilities of each major group of teams are outlined below.
a. Affected Program Management
(1) Approve the activation of the plan or the declaration of a disaster.
(2) Approve expenditures as required.
(3) Coordinate temporary relocation logistics with support services organizations.
(4) Coordinate with NASA Senior Management on the issuance of related news releases to the press and media.
(5) Monitor all activities with the Recovery and Restoration Management Teams.
(6) Provide Senior Management direction and counsel to activated teams as required.
(7) Coordinate with NASA management all personnel matters and issues involving employee fatalities and injuries, including notifications to employees' families and dependents. This may also include professional counseling and financial support for employees.
(8) Review progress and status with Center and NASA Senior Management.
(9) Manage the resumption and recovery of all business operations and service delivery.
(10) Establish and organize a business resumption operation at an alternate site.
(11) Organize the business resumption Command Center.
(12) Direct and support team leaders and make assignments, as appropriate.
(13) Ensure that a damage assessment and salvage operation is conducted at the primary site.
(14) Control the activation of the business resumption procedures.
(15) Coordinate the eventual restoration and relocation of the primary site.
(16) Report resumption and recovery progress to NASA Senior Management.
b. IT Resources, Communications and Data Recovery Management
(1) Contact key personnel required for resumption of time-sensitive operations.
(2) Alert all personnel and instruct them to report to their designated areas, as required.
(3) Perform tasks to resume time-sensitive operations, as required.
(4) Work with support teams to obtain support required for task accomplishment.
(5) Report the status of resumption activity to the management team.
(6) Manage all administrative activities associated with the resumption and recovery operations.
(7) Notify alternate backup sites and/or vendors of disaster declaration. Ensure that backup sites are prepared to accept staff for resumption of operations.
(8) Identify and coordinate procurement actions for equipment and services for alternate site installation, if not a redundant site.
(9) Identify and retrieve all backup files and other vital records from offsite or remote storage.
(10) Request and coordinate installation of data and telecommunications capability if necessary.
(11) Execute IT systems resumption procedures.
(12) Manage IT systems operations at the alternate and primary sites if necessary.
c. Organization Restoration Management
(1) Coordinate salvage and/or reconstruction of the affected facility, records, and file reports, as appropriate.
(2) Coordinate the acquisition and outfitting of a new permanent site, if necessary.
(3) Identify and coordinate procurement for equipment and services for the permanent site.
(4) Work with NASA support teams to obtain required services to restore and outfit a permanent office location.
(5) Manage preparation of a migration plan from the alternate site to the permanent site.
(6) Coordinate migration and move-in logistics with the affected management, IT Communications and Data Recovery teams, and with NASA support services.
4.17.1. The Program Manager, or his/her designee, will develop a reporting structure for the continuity organization that reflects the overall team organization and reporting requirements that will be employed during response, resumption, recovery, and restoration processes.
4.18.1. COOP maintenance procedures are divided into two general categories: scheduled and unscheduled. Scheduled plan maintenance is time-driven, whereas unscheduled plan maintenance is event-driven.
4.18.2. Scheduled Plan Maintenance
a. Scheduled maintenance may consist of quarterly reviews and updates as well as annual structured walk-through and/or tactical exercises (as described in the Plan Exercise section of this document).
b. The purpose of the COOP review is to determine whether changes are required to the procedures, the continuity organization, and notification procedures.
c. The Program Manager, or his/her designee, is responsible for initiating scheduled maintenance activities in consultation with the Management Team.
d. The Program Manager, or his/her designee, shall initiate semiannual continuity plan reviews. He or she shall notify all continuity organization team leaders and alternate team leaders to review the response, resumption, recovery, and restoration task lists, contact information, and procedures for changes that may be required.
e. Other organization staff members may be invited to satisfy the needs of a specific review session.
f. The reviews address events that have occurred within each team's area of responsibility that may affect the response, resumption, recovery, and restoration capability.
g. Teams shall submit changes to the Program Manager, or his/her designee, as they are needed. The Program Manager, or his/her designee, shall incorporate all changes to the COOP and distribute updated copies, as required.
4.18.3. Unscheduled Plan Maintenance
a. Certain maintenance requirements are unpredictable. The majority of unscheduled changes occur as the result of major changes to service level agreements, hardware configurations, networks, and production processing.
b. Examples of items that may trigger the need for unscheduled maintenance of the plan may include:
(1) Changes in data processing architectures, hardware, or environment.
(2) Major changes in operating system(s) or utility software programs.
(3) Major changes in the design of a production database.
(4) Major changes in communications, systems network design, or implementation.
(5) Changes in offsite storage facilities and methods of cycling items.
(6) Improvements or physical changes to the current facility.
(7) Changes in the business or operating environment.
(8) Center and/or Enterprise organization changes that affect continuity teams.
(9) New application systems development.
(10) Discontinuation of an application system from processing schedules.
(11) Transfers, promotions, or resignations of individuals on the emergency notification list or continuity organization teams.
(12) Significant modification of basic operations, data flow requirements, or accounting requirements within an application system.
c. The Program Manager, or his/her designee, should be made aware, in writing, of all changes to the COOP resulting from unscheduled maintenance.
d. The Program Manager, or his/her designee, shall then notify all continuity organization team leaders and alternate team leaders to review the COOP for changes that will be required as a result of the item that has triggered the review.
e. Team leaders will submit actual change data to the COOP Coordinator.
f. The Program Manager, or his/her designee, will team up with the person submitting the change and either update the COOP or assign the update responsibility to the affected continuity team(s). Cross-team coordination should be completed within 2 weeks of the review.
g. The Program Manager, or his/her designee, is responsible for any required updates to the Plan, which result from the review.
h. The Program Manager, or his/her designee, shall print hard copies of the Plan and distribute them as required.
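The distinction between time-driven and event-driven maintenance, together with the 2-week cross-team coordination window in item f, can be tracked with a very small record type. The following is only an illustrative sketch: the class name, field names, and example dates are hypothetical and are not part of this directive.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# "within 2 weeks of the review" (4.18.3, item f)
COORDINATION_WINDOW = timedelta(weeks=2)


@dataclass
class MaintenanceItem:
    """Hypothetical record of one COOP maintenance action."""
    description: str
    event_driven: bool  # True when an unplanned change triggered the review
    review_date: date

    @property
    def category(self) -> str:
        # Scheduled maintenance is time-driven; unscheduled is event-driven.
        return "unscheduled" if self.event_driven else "scheduled"

    @property
    def coordination_due(self) -> Optional[date]:
        # The 2-week window is stated only for unscheduled maintenance,
        # so no deadline is assumed for scheduled reviews.
        if self.event_driven:
            return self.review_date + COORDINATION_WINDOW
        return None


transfer = MaintenanceItem("Team leader transfer", True, date(2024, 3, 1))
semiannual = MaintenanceItem("Semiannual plan review", False, date(2024, 6, 3))
```

Here `transfer.coordination_due` evaluates to 15 March 2024, while `semiannual.coordination_due` is `None` because the text sets no comparable deadline for scheduled reviews.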
4.19.1. The Continuity Plan maintenance process should include a periodic re-evaluation of the minimum staffing, technical support, and services required to provide short-term response, resumption, recovery, and restoration capability.
4.19.2. When IT support is integral to the continuation of essential operations, the reevaluation process will also address the capacity growth requirements associated with the increase of transaction processing volumes of the production application systems, as well as the addition of new systems to the production environment.
a. Based on the existing configuration and requirements, it is assumed that the most effective configuration for supporting long-term recovery and restoration will be the installation of the computer hardware required to support normal to near-normal levels of processing in a temporary environment.
b. Special attention is required to ensure continuing compatibility of existing equipment with that which is installed at the alternate site.
4.20.1. Documentation and periodic reviews of the organization COOP are useful. However, proof and confidence that the COOP will work result only from the successful completion of an exercise or test of the tactical strategies and procedures. Exercises and tests of the individual organization and system COOP are designed to determine:
a. The state of readiness of the continuity organization to respond to and cope with a disaster involving mission-essential operations, facilities, and IT systems and other interdependencies,
b. Whether backed up vital data and records stored offsite are adequate to support resumption of essential operations,
c. Whether inventories, tasks, and procedures are adequate to support resumption of essential operations, and
d. Whether the organization COOP has been properly maintained and updated to reflect the actual resumption, recovery, and restoration needs.
4.20.2. Type and Scope of Exercises or Tests
a. A comprehensive program of exercises and tests varying in scope and level of detail will assist organizations in ensuring the effectiveness of their COOP.
b. Examples of the types of exercises and tests that may be incorporated into the organization training/exercise program are outlined below.
c. At a minimum, organizations having responsibility for COOP activity will test and document the COOP at least annually, using one or more of the suggested exercise types:
(1) Structured Walk-Through
a. In the structured walk-through, a disaster scenario is established, and COOP Teams "walk through" their assigned tasks.
b. This is a "role-playing" activity that requires the participation of at least the team leaders and their alternates.
c. The developed scenario will be made available in advance of the exercise to allow team members to review their assigned tasks in response to the exercise scenario.
d. During the structured walk-through, the COOP is checked for any errors or omissions.
e. At the end of the structured walk-through, any changes to the COOP that are found to be necessary are implemented.
f. This type of exercise can be conducted with or without an independent "monitor."
(2) Tactical Exercise
a. A tactical exercise is a simulated exercise, conducted in a "war game" format.
b. All members of the individual continuity team are required to participate and perform their tasks and procedures under announced or surprise conditions.
c. Participants include an exercise monitor or monitors, depending on the size of the organization.
d. The exercise monitor(s) provides information throughout the exercise to simulate events following an actual disaster.
e. Generally, a disaster scenario is established and is provided to all business operations continuity team leaders, alternate team leaders, and team members located in a large conference room, auditorium, or using video teleconferencing.
f. Each team executes its exercise objectives and interacts with other teams as they complete their actions.
g. A "speeded up" clock is usually employed to complete, at a minimum, 3 days' worth of actions in 1 working day, which requires the teams to respond to the scenario information in near "real" time.
h. An 8-hour exercise will usually simulate 48 to 72 hours of resumption activity.
i. As in the structured walk-through, the plan is checked for any errors or omissions.
j. At the end of the tactical exercise, any changes to the plan that are found to be necessary are implemented.
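As a rough illustration of the accelerated clock described in items g and h, the sketch below converts elapsed exercise time into simulated scenario time. The function name is hypothetical, and the 6x and 9x compression factors are simply what the 8-hour and 48-to-72-hour figures imply; this document does not prescribe a formula.

```python
from datetime import timedelta


def simulated_elapsed(real_elapsed: timedelta,
                      exercise_length: timedelta,
                      simulated_span: timedelta) -> timedelta:
    """Scale real exercise time to scenario time (hypothetical helper)."""
    if exercise_length <= timedelta(0):
        raise ValueError("exercise_length must be positive")
    # Dividing one timedelta by another yields a plain float ratio.
    factor = simulated_span / exercise_length
    return real_elapsed * factor


exercise = timedelta(hours=8)
one_hour = timedelta(hours=1)

# One real hour under the 48-hour and 72-hour scenario lengths.
low = simulated_elapsed(one_hour, exercise, timedelta(hours=48))
high = simulated_elapsed(one_hour, exercise, timedelta(hours=72))
```

With these figures, each real hour of the exercise represents between 6 and 9 hours of resumption activity, which an exercise monitor could use when timing the release of scenario information.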
(3) Live Production
a. In a live production application systems exercise, an operating system is brought to live status on the alternate processing activity, and the data communications network is switched to the alternate site.
b. All resources, other than the operating platform and communications hardware needed to support this exercise, will be retrieved from the offsite storage facility, unless an alternate or redundant site is in existence.
c. This exercise continues with validation of the switching capability of the data communications network and then proceeds to the production of selected application systems, including user login and application system data currency checks.
d. A live production exercise will normally be conducted on a weekend when there is a lesser requirement to provide continued service to the user community.
e. Assurance of overall recoverability can only be achieved through the conduct of a complete Live Production Application System Exercise.
(4) Simulation
a. This type of exercise requires the execution of notification and operating procedures, the use of equipment hardware and software, the possible use of alternate site(s), and operations to ensure proper performance.
b. Simulation exercises can and may be used in conjunction with "checklist" exercises for identification of required COOP modification and staff training.
c. Examples of procedures verified during a simulation exercise include emergency procedures; use of alternate methods; telecommunications backups; agent, vendor, and customer notifications; hardware capacity and performance; software transportability; alternate site(s) access; team mobilization; offsite file and information retrieval; and input data retrieval.
(5) Announced and Unannounced
a. Announced exercises are scheduled exercises generally involving actual resumption of overall operational capacity including IT resources.
b. Production processing is usually not interrupted, but may be planned for actual resumption and validation at the "Hot Site."
c. This type of test usually involves the entire continuity organization, including selected users along with operations and technical staff.
d. Unannounced exercises are surprise technical exercises that require processing to be actually recovered at the alternate site.
e. Production processing continues in parallel and is not interrupted.
f. This type of test generally involves only a small portion of the continuity organization and few, if any, users.
(6) When to Exercise or Test
a. Exercises or tests will be conducted at least annually and whenever--
b. The COOP is first developed and implemented,
c. A major revision to the COOP has been completed,
d. Significant changes in operating systems, applications, and/or data communications have occurred, or
e. The preparedness level of continuity teams needs to be verified.
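The conditions listed under "When to Exercise or Test" can be expressed as a simple check. This is a minimal sketch under stated assumptions: the function and parameter names are hypothetical, and "annually" is read here as 365 days since the last exercise, which the text does not define.

```python
from datetime import date, timedelta
from typing import List, Optional

ANNUAL_INTERVAL = timedelta(days=365)  # assumption: "annually" means 365 days


def exercise_triggers(today: date,
                      last_exercise: Optional[date],
                      major_revision: bool = False,
                      significant_system_change: bool = False,
                      readiness_check_requested: bool = False) -> List[str]:
    """Return the reasons, if any, that an exercise or test is called for."""
    reasons = []
    if last_exercise is None:
        # A COOP that has never been exercised is treated as newly implemented.
        reasons.append("COOP newly developed; never exercised")
    elif today - last_exercise >= ANNUAL_INTERVAL:
        reasons.append("annual exercise due")
    if major_revision:
        reasons.append("major COOP revision completed")
    if significant_system_change:
        reasons.append("significant system or communications change")
    if readiness_check_requested:
        reasons.append("team preparedness to be verified")
    return reasons


overdue = exercise_triggers(date(2024, 6, 1), date(2023, 5, 1))
new_plan = exercise_triggers(date(2024, 6, 1), None, major_revision=True)
current = exercise_triggers(date(2024, 6, 1), date(2024, 2, 1))
```

An empty list means none of the listed conditions currently applies; any non-empty list names the condition or conditions that call for an exercise.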
(7) Responsibility for Establishing Exercise Scenarios
a. The Program Manager, or his/her designee, operating under COOP, is responsible for developing the strategy for each exercise.
b. Development of procedures that measure the effectiveness of the COOP will address the following plan elements:
iii. Resources and Vital Records
(8) Exercise and Test Scenarios
a. Exercise scenarios are normally developed to accomplish the objectives established by Senior Management.
b. Some considerations in developing exercise scenarios include:
i. Reexercising the plan segments that were determined to be deficient in past exercises.
ii. Exercising time-sensitive application systems that have never been recovered or restored, or have not been recently exercised.
iii. Involving those continuity organization team members that need more training and preparation to maintain familiarity with their operations.
iv. Ensuring that each exercise involves the use of only offsite storage and inventory items to ensure completeness and accuracy of the offsite inventory.
v. Deciding whether the exercise and associated parameters will be openly announced or will be a surprise. This decision is usually made at the discretion of the enterprise business continuity officials.
(9) Exercise and Test Evaluation
a. An unbiased evaluation team should be assigned and will evaluate the results of each exercise or test.
b. This team should be made up of personnel external to the organization conducting the exercise or test.
c. The evaluation team should be focused entirely on the validity, currency, and capability of the COOP to recover and restore NASA time-sensitive application systems at the alternate site(s).
d. Recommended members of the evaluation team include--
i. Center Emergency Preparedness and Response Personnel.
ii. Center Safety Officials.
iii. Security Officer.
iv. Vital Records Manager.
v. Program Management.
vi. Other Center Officials, as appropriate.
(10) The Exercise Evaluation Team is charged with the following responsibilities--
i. Familiarization with the overall COOP.
ii. Understanding thoroughly the objectives of the exercise or test to be conducted.
iii. Monitoring and observing all the activities of the teams involved in the exercise or test.
iv. Ensuring that the exercise or test objectives are met, from the organization's and client's perspective.
v. Documenting findings relating to the strengths and weaknesses observed during the exercise or test.
(11) Reviewing Exercise and Test Results
a. Team leaders and program management will document exercise or test results as soon as possible, but not later than 2 weeks after completion of an announced or unannounced exercise or test.
b. Selected members of the continuity organization will review the exercise and test results and resolve weaknesses and problems.
c. The Program Manager, or his/her designee, will chair the review and coordinate appropriate changes and updates to the COOP.
d. The results of the review will be presented to the Center Director, appropriate management personnel, and the appropriate Enterprise Business Continuity Official(s).
e. A copy of the exercise or test results will be provided to the Agency and Center COOP coordinators, respectively.
(12) Schedule of Exercises
a. The Program Manager, or his/her designee, will schedule exercises in coordination with the Center COOP Coordinator.
b. Exercises should be scheduled with consideration to seasonal production and business cycles, the number of processing systems or platforms in production, and the time required to exercise both time-sensitive processes and full production systems.
(13) Education and Training
a. Awareness of the need for, and the process of, maintaining a viable continuity capability is essential and federally mandated.
b. This awareness will be achieved through formal education and training sessions conducted on at least an annual basis.
c. This training ensures that the COOP program and processes are understood by the personnel responsible for maintaining and implementing the plan.
d. The objectives of COOP training are to--
i. Train all key employees and management who are required to help maintain the plan in a constant state of readiness.
ii. Train key employees and management who are required to execute various plan segments in the event of an extended disruption in normal operations.
iii. Heighten planning awareness for those employees not directly involved in maintaining and/or executing the Plan.
e. The individual Center COOP Coordinator will schedule educational seminars addressing individual COOP activity at least semiannually.
f. These seminars will include overviews of the--
i. Continuity strategy, priorities, and timeframes.
ii. Business continuity organization structure and responsibilities.
iii. Individual COOP structure and contents.
iv. Data preservation methodologies and practices.
v. Mobilization, transportation, and transfer of actions to alternate site(s).
vi. Plan administration, maintenance, and exercises.
This document does not bind the public, except as authorized by law or as incorporated into a contract. This document is uncontrolled when printed. Check the NASA Online Directives Information System (NODIS) Library to verify that this is the correct version before use: https://nodis3.gsfc.nasa.gov.