NASA Procedural Requirements
NPR 1040.1 Effective Date: July 03, 2003 Expiration Date: August 03, 2024
3.1.1 COOP involves more than planning for a move offsite if a disastrous event destroys or disrupts a mission-essential operation, function, facility(ies), supporting IT system(s), or other interdependent essential infrastructure, on which the mission-essential operation depends.
3.1.2 COOP will also address how to keep an organization's mission-essential operations and supporting systems operating in case of long-term disruptions.
3.2.1 To ensure that the Agency's critical operations are thoroughly reviewed for COOP consideration, Centers will assess Agency Mission Essential Infrastructure (MEI), supporting operations, and other interdependencies, and evaluate that infrastructure from a risk to the national welfare perspective, which by themselves or as a result of a Memorandum of Understanding, or other agreement with another Federal agency, will continue to operate or remain capable to operate at the primary or an alternate location under all emergency circumstances.
3.2.2 Center MEI inventories will be identified and maintained by the Center Critical Infrastructure Assurance Office (CIAO) per requirements found in NPR 1620.1, NASA Security Procedural Requirements, as amended.
3.2.3 Centers should use the following criteria when making COOP judgments. These criteria shall serve to identify essential operations that require development of COOP:
a. Would the loss of a Center MEI capability or operation compromise national security?
b. Would the loss of a Center mission-essential infrastructure capability or operation have an immediate and significant adverse effect on the health and safety of the general public at large?
c. Is a NASA Center mission-essential capability or operation critical to the performance of another agency's COOP essential operations and required, by agreement, to remain viable, without interruption, under all emergency conditions?
d. Is the NASA mission-essential capability or operation regulated, legislated, or directed by Executive order to operate under all emergency scenarios?
e. Is the mission-essential capability or operation tied into space exploration vehicle and equipment command and control operations that, if rendered inoperable, would place personnel, vehicles, and/or equipment at risk? Would the cost to recover from such an event exceed NASA's budget capability?
f. Is the mission-essential capability or operation deemed a vital service, as determined by NASA management and, therefore, required under COOP?
3.2.4 The ability for NASA's Senior Management to continue to manage the Agency and individual Centers during a disastrous event is inherently critical to NASA and the U.S. Government. Essential management operations will be included under a COOP.
3.2.5 NASA assets identified as MEI may, due to their size, configuration, and age, be difficult and expensive or impractical to relocate to an alternate facility or rebuild if destroyed (e.g., wind tunnels, Local Area Network (LAN), Wide Area Network (WAN)). These assets should be carefully evaluated under COOP criteria to ensure that all aspects of their criticality and irreplaceability are thoroughly considered before establishing a COOP.
3.3.1 Evaluation of Identified Mission-Essential Operations
a. The evaluation of an organization's MEI operations generally centers around the organization's business plan.
b. Because the development of the business plan is used to support the continuity of operations planning process, it is necessary, not only to ensure accurate evaluation of essential operations and business processes, in accordance with criteria established in paragraph 3.2, but also to set priorities and time criticalities for them.
c. The Program Manager and staff are responsible for ensuring the completion of the business plan and for prioritizing the resumption, recovery, or restoration needs for the organization's mission-essential operations, if any.
d. Because a fully redundant capability for each function is prohibitively expensive for most organizations, certain operations will not be performed in case of a disaster.
e. Failure to set appropriate priorities could jeopardize the organization's ability to survive a disastrous event.
f. Development of a Mission Statement.
(1) Government departments, divisions, and offices generally have a formal statement concerning the mission(s) to be performed.
(2) These statements may be contained in organization policies or directives, handbooks, or public information guides.
(3) Regardless of the source, the criticality determination process starts with an overall mission statement that identifies what the organization is responsible for doing.
g. Functional Activities Listing.
(1) The process continues by developing a list of all operations performed by the office in support of the essential mission.
(2) In parallel with this listing, it is also necessary to identify those operations that are dependent on specialized support (such as IT Systems, communications, certain data or records, physical infrastructure, human resources) as well as the extent of that dependency (i.e., is the function totally dependent on a particular type of support, is only some portion that can be quantified dependent on such support, or could the function be performed manually with little or no loss of efficiency).
(3) See Paragraph 3.3 for a more detailed discussion of specialized support resources.
(4) Any special requirements affecting the performance of the function or relating to the information involved should also be noted.
(5) These could include the sensitivity of data, or whether there is a specific timeframe when data are more critical than other times.
h. Criticality Matrix.
(1) The next element is the development of a criticality matrix.
(2) Criticality guidelines shall be developed which identify those office operations that deal with aspects that are critical to any Government agency, and the timeframes that will be associated with those factors.
(3) Developing a matrix, similar to that used for determining sensitivity and protection requirements in system security plans, is one approach to determining criticality.
(4) Most NASA offices and operating Centers are not involved with the more obviously critical factors, such as saving lives or national defense.
(5) They should develop an office-specific list of critical operations.
(6) The primary objective is to identify only those mission-essential operations that, if not performed, will cause the greatest loss to the office in terms of inability to operate and the expenditure of additional funds.
(7) Reserved.
(8) These guidelines should permit each function to be evaluated in terms of the importance of the function in accomplishing the mission of the office and how quickly this function will be performed.
(9) The longer the function can do without specialized support, the less critical that support is; hence, criticality is a function of time.
i. Criticality Determination.
(1) The next part of the process is to compare the functional activities against the criticality determinations and corresponding timeframes.
(2) A COOP is concerned with all mission-essential operations and goes beyond those functions requiring only IT processing and operations.
(3) All office operations are compared against the criticality determinations and time factors, and against each other.
(4) The result is a prioritized list of mission-essential activities, based on criticality, and reflected in terms of the maximum timeframe that these essential operations can go unperformed before the organization fails to accomplish its mission.
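The criticality determination described in paragraph i can be sketched as a simple sort. This is a minimal illustration only, assuming each operation has already been assigned a maximum tolerable downtime; the `Operation` class, the `prioritize` function, and the sample operations and hour values are hypothetical and do not come from this NPR.

```python
from dataclasses import dataclass


@dataclass
class Operation:
    """One entry from the functional activities listing (hypothetical fields)."""
    name: str
    max_downtime_hours: float  # longest the operation can go unperformed
    needs_specialized_support: bool = False


def prioritize(operations):
    """Return operations ordered from most to least time-critical."""
    # A shorter tolerable downtime means a higher criticality ranking.
    return sorted(operations, key=lambda op: op.max_downtime_hours)


# Illustrative values only; real timeframes come from the criticality matrix.
operations = [
    Operation("Payroll processing", 336),
    Operation("Vehicle command and control", 0.5, True),
    Operation("Public web site", 72, True),
]

priority_list = prioritize(operations)
for rank, op in enumerate(priority_list, start=1):
    print(rank, op.name, op.max_downtime_hours)
```

Sorting on a single time value is a simplification; an office could add further keys, such as dependence on specialized support, to break ties.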
3.3.2 Identification of Resources, Vital Records, and Interdependencies that support NASA's Mission-Essential Operations.
a. After essential missions and business operations are identified, support resources and vital records will be identified, as well as the timeframes in which each resource is used and the effect of unavailable resources on the essential operations in support of the mission.
b. It is important to note that the COOP resources inventory will consist of only those physical resources, vital records, and support services necessary for an office and organization to perform the essential parts of its mission.
c. The COOP does not provide for the immediate or eventual replacement of all existing resources at an alternate site. Rather, it is intended to implement a viable and effective essential function in an alternate location for at least 30 days.
d. In addition to precisely identifying the minimum levels of resources required to activate a temporary office, the resources inventory should also identify who is responsible for each category of items, where the existing items are located (and, if backup supplies already exist, where they are located and in what quantity), what and where the source of replacement or resupply is, and, in some instances, the cost and timeframe for replacement.
e. As COOP progresses, preparatory actions will drive the modification or expansion of certain inventory data.
f. Continuity of operations planning should address all the resources needed to perform an essential function, including:
(1) Human Resources.
(a) Human resources requirements include essential management staff, operational and support personnel, systems users, and security personnel.
(b) Some essential operations require personnel with special expertise or training, while others require lesser skill levels.
(c) Security is especially critical when continuous protection of a vacated site and protection of an alternate site need to be considered simultaneously.
(d) Additionally, the human resources aspect of continuity of operations planning includes establishment of plans of succession and Delegations of Authority (DOA) for both Headquarters Operations and individual Centers.
(e) COOP planners will also consider Plans of Succession and DOA for each organization, program, or project operating under a COOP.
(2) Processing Capability.
(a) Traditionally, contingency planning has focused on processing power.
(b) Although the need for data backup remains vital, today's other processing alternatives are also important.
(c) LAN's, microcomputers, workstations, and personal computers in all forms of centralized and distributed processing may be performing critical tasks.
(3) Automated Applications and Data.
(a) NASA information systems run applications that process all types of data, run all types of programs, and reach far into space.
(b) Without current electronic versions of both vital applications and data, computerized processing may not be possible.
(c) If the processing is being performed on alternate hardware, the applications will be compatible with the primary hardware, operating systems and other software (including version and configuration), and numerous other technical factors.
(4) IT-Based Services.
(a) NASA uses many different kinds of IT-based services to perform most, if not all, of its essential and nonessential operations and functions.
(b) The two most important IT services are normally communications services and information services.
(c) Communications can be further categorized as data and voice and, in some instances, satellite.
(d) However, in many Centers these may be managed by the same service. Information services include any source of information outside of the organization. Most of these sources are automated, including Online Government and private databases, the Internet, and external e-mail.
(5) Secure Communications.
(a) COOP will include the development and implementation of secure communications capability for key personnel, when appropriate.
(b) COOP will include hard-wire and wireless capability, requirements and procedures for use, pre-event purchase and deployment, familiarization, and training.
(6) Communication with Key Government Officials.
(a) COOP will be developed and properly coordinated to ensure communications capability with key Government Officials (e.g., The President, The Vice President, National Command Authority (NCA), Department of Homeland Security (DHS), and others as necessary), which may be effected by the NASA Administrator during an emergency event affecting the Washington, DC, area, resulting in mass evacuation of Government agencies, or under any other emergency situation in which key personnel of the Federal Government may become widely dispersed and require dependable and secure modes of communication with the executive branch.
(7) Telecommunications Service Priority (TSP) Program.
(a) The TSP Program is a Federal Communications Commission (FCC) program used to identify and prioritize telecommunication services that support National Security or Emergency Preparedness (NS/EP) missions.
(b) The TSP Program also provides a legal means for the telecommunications industry to provide preferential treatment to services enrolled in the program.
(c) Center organizations establishing COOP capability should consider participation in the TSP Program, where appropriate.
(8) Physical Infrastructure.
(a) Physical infrastructure elements include a safe working environment and appropriate equipment and utilities.
(b) This can include office space, heating, cooling, venting, power (including determining the need and source of uninterrupted power), water, sewage, other utilities, desks, fax machines, personal computers, terminals, courier services, file cabinets, and many other items.
(c) In addition, computers also need space and utilities, such as electricity and communications lines (connectivity). Electronic and paper media used to store applications and data may also have specific physical requirements.
(9) Vital Records, Documents, and Papers.
(a) The performance of many NASA operations relies on vital records and various documents, papers, or forms.
(b) These records could be important because of legal need, or because they are the only record of the information.
(c) Records can be maintained on paper, microfiche, microfilm, magnetic media, or optical disk.
3.3.3 Establishment of Delegations of Authority (DOA)
a. DOA's should be pre-established to enable designated personnel to make the appropriate policy determinations at headquarters, Centers, and other organizational levels, as deemed appropriate, to ensure rapid response to any emergency situation requiring COOP implementation.
b. These DOA's should be included as an appendix to the COOP Plan and should do the following:
(1) Identify the programs and administrative authorities needed for effective operations at all organizational levels having emergency responsibilities.
(2) Identify the circumstances under which the authorities would be exercised.
(3) Document the necessary authorities at all points where emergency actions may be required, delineating the limits of the authority and accountability.
(4) Clearly state the authority of designated successors, to exercise Agency direction, including any exceptions, and the successor's authority to redelegate operations and activities, as appropriate.
(5) Indicate the circumstances under which the delegated authorities would become effective and when they would terminate.
(6) Ensure that officials who may be expected to assume authorities in an emergency are trained to carry them out.
(7) Be appropriately updated upon transfer, termination, or other personnel action resulting in the individual's departure.
3.3.4 Plans of Succession (POS).
a. NASA Headquarters and individual Centers will establish, promulgate, and maintain POS to key positions.
b. POS are an essential part of NASA's COOP activity and will be included in the COOP plan as an appendix.
c. POS should be sufficient in depth to ensure NASA's ability to perform essential operations while remaining a viable part of the Federal Government during any emergency.
d. Geographical dispersion is essential, consistent with the requirement for ensuring appropriate succession to office in emergencies of all types.
e. Each principal NASA activity will, as appropriate:
(1) Establish POS to the position of NASA Associate and Assistant Administrators.
(2) Establish POS to other key Headquarters leadership positions.
(3) Establish POS for each Center.
(4) Identify any limitations of authority based on DOA's to others.
(5) Describe POS by positions or title, rather than names of individuals.
(6) Include the POS in the vital records inventory of the Agency and Center.
(7) Revise POS as necessary and distribute revised versions promptly as changes occur.
(8) Establish the rules and procedures that designated officials are to follow when facing the issues of succession to an office in emergency situations.
(9) Include in succession procedures the conditions under which succession will take place, method of notification, and any temporal, geographical, or organizational limitations of authorities.
(10) Assign successors to the extent possible among emergency teams established to perform essential operations, to ensure each team has an equitable share of duly constituted leadership.
(11) Conduct orientation and training programs to prepare successors for their emergency duties.
3.3.5 Anticipation of Potential Contingencies or Disasters.
a. Although it is impossible to anticipate everything that can go wrong, this step involves identifying a likely range of problems.
b. Developing scenarios can help an organization to prepare a plan that addresses a wide range of possible mishaps.
c. COOP planners should consider that the changing threat environment, including military or terrorist attack-related incidents, has shifted awareness to the need for COOP capabilities that enable NASA to continue its mission-essential operations across a broad spectrum of emergencies.
3.3.6 Selection of Continuity of Operations Planning Strategies.
a. When strategies are developed and evaluated, existing controls for preventing and minimizing losses should be considered.
b. Because no one set of controls can prevent all losses in a cost-effective manner, prevention and recovery efforts should be coordinated.
c. Risk assessments, conducted by the COOP management team, can also help determine an optimal strategy.
d. A COOP strategy normally consists of five parts: prevention, response, resumption, recovery, and restoration of services or operations:
(1) Prevention refers to those measures taken to forestall a disruption of service (e.g., preventive maintenance, virus prevention, physical and/or procedural security measures as developed under the Agency Critical Infrastructure Protection Program).
(2) Response encompasses the initial actions taken to protect lives and limit damage.
(3) Resumption refers to the steps taken to continue support for critical operations.
(4) Recovery concerns the reactivation of a greater scope of business processes and services beyond the most time-sensitive processes.
(5) Restoration is the return to normal operations.
e. The longer it takes to restore normal operations, the longer the organization will have to operate in the resumption or recovery mode.
f. The selection of a strategy needs to be based on practical considerations, including feasibility and cost.
g. Different categories of resources should also be considered.
3.3.6.1 Human Resources.
a. During a major continuity plan implementation, people will be under significant stress and may panic.
b. If the continuity plan is implemented as a result of a local or regional disaster, their first concerns will probably be their family and property.
c. In addition, many people will be either unwilling or unable to come to or remain at work, or to travel to an alternate site to assist in resumption of operations.
d. Cross-training of employees inside and outside the affected organization is one way to ensure availability of sufficient personnel to assist in maintaining essential mission capability.
e. Additional hiring or temporary services are also available but should be carefully considered and weighed against any possible security vulnerabilities.
3.3.6.2 IT Processing Capability.
a. For mission-essential operations that rely heavily on IT support, in less serious events involving short-term disruptions, processing capabilities can be restored from backups or original media, by repairing equipment components, or by purchasing new equipment.
b. Federal agencies have the authority to issue purchase orders to quickly acquire needed equipment and supplies in limited quantities.
c. This authority is usually limited but is sufficient to acquire Commercial Off-The-Shelf (COTS) hardware and software.
d. Essential hardware could be acquired by purchase orders in one of two ways: purchase of replacement or upgraded equipment, or lease of essential equipment for a limited period of time.
e. The outright purchase of identical replacement hardware is the most obvious use of the purchase order option.
f. However, the purchase of upgraded equipment is a reasonable alternative, given the fact that the existing equipment may not be economically salvageable.
g. On the other hand, the short-term lease of essential equipment to augment or temporarily replace existing equipment during salvage operations provides a cost-effective alternative.
h. There is, however, a risk involved with the leasing of equipment that will be addressed for long-term events under a COOP. The risk arises because those systems will need to be thoroughly scrubbed prior to deployment, to ensure they are free of infected hardware and software, and again in preparation for return to the vendor, to ensure the integrity and protection of information that has been stored or processed on the machines.
i. Although these two options could offset the lack of facilities and equipment at the time of the disaster, they are subject to the disadvantages of high cost and long preparation time.
j. In a widespread disaster, the requirements for space, hardware, and communications could temporarily exceed the available supply.
k. These two options will also be used in combination because neither provides for facilities (to include associated utilities and communications) and equipment, furnishings, and supplies.
l. For less serious events involving short-term disruptions, the COOP defers to the use of in-house contingency plans developed under the auspices of the Center Emergency Preparedness Program or IT Security Plans, developed under the requirements of OMB Circular A-130.
m. For a more serious continuity event, however, the strategies for ensuring operational capability are normally grouped in five categories:
(1) Category 1. Hot Site. A hot site is a building already equipped with processing capability and other services.
(a) Operational standby facilities require a subscription contract and charge various fees.
(b) Normally, a 3- or 5-year contract is negotiated and includes a specific hardware and software configuration with detailed communications requirements, which will be updated whenever changes occur. Subscription fees are determined by these requirements.
(c) The reduction in costs for the minimum essential capabilities required by a COOP is not significant and may not be warranted for continuity of operations.
(d) Another potential drawback for the IT user is that these services are relatively new and not widely dispersed.
(e) Therefore, a Hot Site facility may not be conveniently located.
(2) Category 2. Cold Site. A Cold Site is a building for housing alternate operations and/or processors that can be easily adapted for use.
(a) Such a facility may be owned by NASA, situated at a NASA Center, owned by another Government agency (e.g., DOD, GSA), or Government-leased for one or more organizations.
(b) If a disastrous event occurs, the affected office(s), in conjunction with hardware vendors, acquires and installs the essential IT hardware, software, and communications.
(c) Cold Sites are more practical for IT-based operations since a shell facility may be owned by NASA or the facility can be virtually any office space with sufficient electrical power, communications line capability (installed or capable of being installed during a disaster situation), and regular air conditioning.
(d) IT hardware is more readily available, more easily shipped, and more easily installed.
(e) A Cold Site may also be supported by a special equipment contract (if not already in place as part of a standard hardware maintenance agreement).
(f) There are a number of hardware vendors who offer guaranteed delivery and setup within 24 hours.
(g) Although the maintenance costs are somewhat less than an operational standby, they represent a continuing expense.
(h) For IT activities, consideration should be given to leasing essential IT equipment and peripherals to augment equipment salvage from the primary site or to temporarily replace essential hardware until the primary site can be restored without additional disruption to IT configuration.
(i) Leasing eliminates the ongoing maintenance costs of a special equipment contract but does not provide for the guarantees that appropriate equipment will be available when needed or within required timeframes.
(j) As described earlier, leasing also creates the problem of data security, as special precautions will be taken to ensure that all data that have been stored or processed on the system have been removed from the leased equipment.
(k) This practice requires more than a simple deletion of the data as deleted files can still be detected, identified, and restored.
(l) The site availability timeframe, which includes hardware, communications, and equipment installation, may not meet organizational or system requirements as set forth in continuity plans.
(3) Category 3. Redundant Site. A redundant site is a site that is equipped and configured exactly like the primary site.
(a) It is either operating in parallel with the primary site or can be activated at a moment's notice.
(b) A redundant site is critical in situations where the asset is required to be 100 percent reliable and its operation cannot be interrupted for any length of time.
(4) Category 4. Reciprocal Agreement. A reciprocal agreement is a formal agreement that allows two organizations to back up each other.
(a) The agreement is usually with an external organization, for the two to provide backup IT processing support to one another in the event of a disruption in primary processing support.
(b) The external office, division, or directorate is not in the business of providing IT support, but agrees to provide reciprocal support in recognition of mutual backup requirement.
(c) Although low development and maintenance costs are the principal advantage to this alternative, consideration will be given to establishing an agreement with an organization that will not be affected by the same disaster.
(d) Reaching an agreement with another activity, such as a counterpart office in another division or operating Center, provides no effective continuity of operations capability if that activity is affected by the same disaster.
(e) The activities establishing a mutual assistance agreement should be geographically separated.
(f) The biggest disadvantage of mutual assistance agreements is that, "Their disaster becomes your disaster."
(g) Many of the disadvantages noted above identify areas of hardship and general inconvenience to both activities.
(h) Without a specific mission-essential operation, required staffing, supporting specialized services (e.g., IT system, site, other interdependencies), and pair of organizations in mind, it is difficult to evaluate a mutual-assistance agreement alternative completely and fairly.
(i) Mutual-assistance agreements are not considered viable solutions without a formal agreement outlining all conditions and signed by individuals in positions of authority to uphold the agreement.
(5) Category 5. Hybrids. Any combinations of the above, such as having a Hot Site as a backup in case a redundant or reciprocal agreement site is damaged by a separate contingency.
(a) In addition to these five alternatives, another approach readily available to mission-essential operations dependent on IT environments, is to allow key staff to work at home (telecommute) during the emergency event.
(b) Even limited use of this alternative could ease the continuity of operations burden by reducing or eliminating the need to provide suitable office space and to acquire hardware and/or software assets.
(c) In addition to reduced costs, it offers the advantage of immediate availability.
(d) It can also serve to reduce the level of anxiety staff may experience if separated from their families during the event.
(e) It can be thoroughly tested; however, there are also disadvantages.
(f) The event may be so widespread as to have disrupted service in a region as opposed to a local area.
(g) Also, technical and maintenance support to privately owned property poses legal difficulties and limits sustainability, while information security and anti-viral protection could become issues requiring well thought-out solutions.
(h) Providing Government-owned IT equipment for use at home, or requiring designated personnel to take home issued laptops on a daily basis, could reduce legal and operational concerns.
Figure 1 presents a set of evaluation characteristics that may be used to help weigh the alternatives for determining alternate site operational and processing capability.
| Evaluation Characteristic | Planning Considerations |
| --- | --- |
| Compatibility | Hardware, software, and communications that are or would have to be installed at an alternate site will be the same as or compatible with original equipment supported. |
| Accessibility | The alternate site will be readily accessible, but not so close as to share the same disaster. |
| Reliability | The alternate site will be capable of supporting the operations of the affected office(s) 24 hours a day, 7 days a week. Maintenance for site equipment, hardware, and communications should be on-site or on-call. |
| Capacity | The alternate site and facility and computer equipment will have sufficient floor space, heating, cooling, and power (including uninterrupted power when required), communications lines, and memory capacity to support the number of staff and suite of equipment required. |
| Security | Physical security at the alternate site will be sufficient to protect personnel, property, and the sensitivity of the information and data. Security assessments will need to be conducted by assigned security personnel. |
| Time to prepare | There will be sufficient time to prepare for the disaster, including time to prepare and convert data and software; prepare the site; prepare and store supplies, forms, and documentation; obtain and install power and communications circuits; and prepare and test the COOP. |
| Support and assistance | There will be on-site technical support and assistance to set up and configure the hardware, software, and communications. |
| Cost | Cost factors can be subdivided into three categories: |
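One way to weigh alternatives against the Figure 1 characteristics is a weighted scoring matrix. The sketch below is illustrative only: the weights, the 1-to-5 rating scale, the `weighted_score` function, and the sample ratings are all hypothetical assumptions and are not prescribed by this NPR.

```python
# Hypothetical weights (percent, summing to 100) for the Figure 1 characteristics.
WEIGHTS = {
    "compatibility": 20,
    "accessibility": 10,
    "reliability": 20,
    "capacity": 10,
    "security": 15,
    "time_to_prepare": 10,
    "support_and_assistance": 5,
    "cost": 10,  # rated so that a higher number means more affordable
}


def weighted_score(ratings):
    """Combine 1-5 ratings into one weighted figure (higher is better)."""
    missing = set(WEIGHTS) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings: {sorted(missing)}")
    total = sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)
    return total / sum(WEIGHTS.values())


# Illustrative ratings only; real values come from the planner's assessment.
candidates = {
    "hot_site": dict(compatibility=5, accessibility=3, reliability=5, capacity=4,
                     security=4, time_to_prepare=5, support_and_assistance=5, cost=1),
    "cold_site": dict(compatibility=3, accessibility=4, reliability=3, capacity=4,
                      security=3, time_to_prepare=2, support_and_assistance=3, cost=4),
    "reciprocal_agreement": dict(compatibility=2, accessibility=3, reliability=2,
                                 capacity=2, security=2, time_to_prepare=3,
                                 support_and_assistance=2, cost=5),
}

# Rank the alternatives from best to worst weighted score.
ranking = sorted(candidates, key=lambda name: weighted_score(candidates[name]),
                 reverse=True)
for name in ranking:
    print(name, round(weighted_score(candidates[name]), 2))
```

A single score hides trade-offs, so a planner would likely still review any characteristic that rates poorly on its own, such as an alternate site close enough to share the same disaster.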
3.3.6.3 Automated Applications and Data.
a. Normally, the primary contingency or continuity strategy for applications and data is regular backup and secure offsite storage.
b. Important issues to be addressed include the frequency of backups, the frequency of offsite storage, and the manner of transporting backups.
c. Office policy should require IT or LAN administrators to make separate master copies immediately upon implementation of approved changes and to store the masters in a secure offsite location, together with copies of all applicable hardcopy documentation and operating manuals.
d. A similar policy should require the appropriate individual(s) to prepare backup copies of all electronic files on a regular (e.g., not less than weekly) basis, to maintain copies of all required references and hardcopy files, and to store the backup copies in a secure offsite location.
e. In the IT environment, the volume of equipment and supplies to be stored is relatively small, based on the nature of the media involved (diskettes, 8 mm cartridge tapes, CDs).
f. Headquarters and individual Center Data Centers and operational LANs/WANs make provisions for storing these types of materials in support of Agency activities.
g. Hardcopy data could be stored on a permanent retention basis in local agency, Center, or general Federal storage facilities.
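The weekly minimum in item d lends itself to a simple automated check. The following is a minimal sketch in Python and is not part of this NPR. It assumes that "not less than weekly" means no backup may be older than seven days; the function names, file-set names, and dates are hypothetical and shown for illustration only.

```python
from datetime import datetime, timedelta

# Assumption: "not less than weekly" (item d) is read as
# "the latest backup is no more than 7 days old".
MAX_BACKUP_AGE = timedelta(days=7)


def backup_is_current(last_backup: datetime, now: datetime,
                      max_age: timedelta = MAX_BACKUP_AGE) -> bool:
    """Return True if the most recent backup is no older than max_age."""
    return now - last_backup <= max_age


def overdue_backups(last_backups: dict[str, datetime], now: datetime) -> list[str]:
    """Return the names of file sets whose latest backup is too old, sorted by name."""
    return sorted(name for name, taken in last_backups.items()
                  if not backup_is_current(taken, now))


if __name__ == "__main__":
    now = datetime(2003, 7, 10, 9, 0)
    # Hypothetical file sets and backup times, for illustration only.
    history = {
        "payroll-files": datetime(2003, 7, 8, 22, 0),
        "lan-config-masters": datetime(2003, 6, 30, 22, 0),
    }
    print(overdue_backups(history, now))
```

An organization that backs up more often than weekly could pass a shorter `max_age` rather than change the default.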
3.3.6.4. IT-Based Services.
a. Communications is also a key discriminator in selecting an appropriate COOP alternative.
b. Incompatible communications or insufficient lines may disqualify a site or option.
c. The COOP planner will ensure that adequate compatible communications are available at the alternate site or that they can be provided during a disaster situation.
d. As appropriate, an agreement with a communications vendor will be negotiated. This agreement will cover all necessary voice, data, and image communications.
e. Separate agreements will also be negotiated with equipment vendors for modems, fax machines, telephones, encryption devices, and keys, if required.
f. Service providers may offer contingency services.
g. Voice communications carriers often can reroute calls to a new location, and data communications carriers can also reroute traffic.
h. Local voice service may be carried via cellular phones.
i. If one service is down, it may be possible to use another.
j. Resuming normal operations may require rerouting of communications.
3.3.6.5. Physical Infrastructure.
a. Arrangements will be made for office space, furniture, data and communications processing capability, other support, and more, as applicable.
b. If the COOP calls for moving offsite to an alternate facility, procedures need to be developed to ensure a smooth transition back to the primary facility or to a new permanent location.
c. A related alternative available to Federal agencies is the U.S. Government's procurement system.
d. The General Services Administration (GSA) is responsible for assisting Federal agencies in acquiring additional office space on an "as required" basis and for managing standing contracts for goods and services. See Federal Response Plan, 9239.1-PL.
e. Because minimal space is required for continuity of operations activities, this capability permits fairly rapid acquisition of space without the "overhead" costs of rents or subscription fees.
3.3.6.6. Vital Records, Documents, and Papers.
a. The primary contingency strategy is usually to back up records onto magnetic, optical, microfiche, or other media and to store the backups offsite.
b. Copies of such records should be cycled on a schedule to be determined by the organization to ensure that the copies are current and acceptable.
c. A supply of forms and other needed papers can be stored offsite as well.
d. Backup storage space should be located close enough to the primary site for convenience in placing items into storage on a regular basis, but not so close that it will be affected by the same disaster.
e. Onsite (i.e., same office and building) storage is not recommended.
3.3.7. Documenting Continuity of Operations Planning Strategies.
a. With continuity of operations strategies well defined, the next step is to create the COOP itself.
b. The COOP needs to be written, kept up-to-date as the organization, systems, and other factors change, and stored in a safe place.
c. A written plan is critical during a continuity of operations event, especially if the person who developed the plan is unavailable to assist in its implementation.
d. It should clearly state in simple language the sequence of tasks to be performed in the event of a contingency so that someone with minimal knowledge could immediately begin to execute the plan.
e. It is generally helpful to store up-to-date copies of the COOP in several locations, including any offsite locations, such as alternate processing sites or backup data storage facilities. The structure of the COOP includes:
(1) Plan Overview - consists of an introduction, statement of policy, objectives, scope, assumptions, recovery strategy, and plan administration responsibilities.
(2) Continuity Process Overview - outlines the four major stages of the process, after prevention (emergency response, resumption, recovery, and restoration), including the central activities and objectives of each stage, and the relationships among stages.
(3) Continuity Team Organization - defines the specific organization set up to work towards survival and the resumption of time-sensitive essential operations. The teams associated with this plan represent office functional units and/or support operations developed to respond, resume, recover, or restore operations of the facility and/or system. Each team is composed of individuals with specific responsibilities or tasks that will be completed to fully execute the plan. The organization will be based upon the emergency incident command structure established by the National Interagency Incident Management System (NIIMS). Figure 2 below presents a representative example of a Continuity Team Organization structure.
(4) Plan Maintenance - includes both scheduled and unscheduled maintenance, as well as a periodic reevaluation process. Scheduled maintenance consists of quarterly reviews and updates, as well as annual structured walk-throughs and/or tactical exercises (as described in the Plan Exercise section below). The purpose of the plan review is to determine whether changes are required to strategies, tasks, procedures, the continuity organization, and notification procedures. The majority of unscheduled maintenance activities occur as a result of major changes to service level agreements, hardware configurations, networks, and production processing. The Continuity Plan maintenance process should also include a periodic reevaluation of the minimum hardware capacity required to provide short-term response, resumption, recovery, and restoration capability. The reevaluation process will address the capacity growth requirements associated with the increase of transaction processing volumes of the production application systems, as well as the addition of new systems to the production environment.
(5) Plan Exercise - consists of the various types and scope of exercises designed to test and evaluate the COOP. Exercises should be conducted not less than annually, and also when a COOP has been implemented, when a major revision to the plan has been completed, when additional systems or requirements are implemented, when significant changes in systems, applications, and/or data communications have occurred, and when the preparedness level of continuity teams is to be verified. An exercise may include structured walk-throughs, tactical exercises, live production exercises, simulations, and announced and unannounced exercises.
(6) Plan Execution - details for plan execution reside in the Appendices to the COOP itself. These Appendices contain specific data required by the various teams to perform their designated roles during each stage of the process. Appendices include--
(a) Priority Contact List - includes employee names and contact information.
(b) Employee/Contractor Notification List - contains a directed list of who is to contact whom regarding the communication of continuity information.
(c) Team Member Roster - identifies the specific individuals belonging to each Team and their contact information.
(d) Team Task List with Dependencies - consists of a detailed, step-by-step listing of each task to be performed by the members of the various continuity teams. Where a specific task will await action by a member of another team, this is so noted, and the task and responsible individual are identified. This area is key to the entire COOP.
(e) Enterprise Process Configuration - lists, for each IT system or process, the associated software, equipment, supplies, network information, and responsible teams.
(f) Vendor Representatives - contains a listing of all applicable vendor contact information, including local representatives and focal points within the organization.
(g) Location Information - contains the location of all offsite storage, alternate operating locations (Hot or Cold Sites), and record repositories. Driving or transportation instructions and personnel focal point contact information are also included for each location.
(h) Vital Records - includes a listing of all necessary emergency operating documents, manuals, diskettes, CD-ROMs, and all other media necessary to implement the COOP. Additionally, this includes appropriate personnel, legal, and financial records that, if lost, would seriously impact the Agency, Center, and organization's capability to conduct mission-essential operations, and in some cases, seriously impact recovery and resumption of normal operations.
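The Team Task List with Dependencies in item (d) can be modeled as a dependency graph in which a task that awaits another team's action points to the task it waits on. The following is a minimal sketch in Python and is not part of this NPR. It uses the standard library `graphlib` module (Python 3.9 or later); the task names and the `execution_order` helper are hypothetical and shown for illustration only.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical tasks: each key awaits completion of the tasks in its set.
TASK_DEPENDENCIES: dict[str, set[str]] = {
    "notify-team-leaders": set(),
    "retrieve-offsite-backups": {"notify-team-leaders"},
    "activate-alternate-site": {"notify-team-leaders"},
    "restore-applications": {"retrieve-offsite-backups", "activate-alternate-site"},
    "reroute-communications": {"activate-alternate-site"},
}


def execution_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return one valid order in which every task follows the tasks it awaits.

    Raises ValueError if the tasks wait on each other in a loop, which
    would mean the task list as written can never be completed.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as exc:
        # The second element of args holds the tasks that form the loop.
        raise ValueError(f"circular task dependency: {exc.args[1]}") from exc


if __name__ == "__main__":
    for step, task in enumerate(execution_order(TASK_DEPENDENCIES), start=1):
        print(step, task)
```

Several orders may be valid for the same task list; the sketch returns one of them. Running such a check whenever the task list changes may help detect circular dependencies before an exercise does.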
3.3.8. Test and Revise Strategy.
a. A COOP should be tested to train personnel and to keep the plan in step with changes to the operating environment.
b. The extent and frequency of testing will vary among organizations, systems, and particular mission.
c. There are several types of testing--
(1) Review: This is a simple test to check the accuracy of the COOP. For instance, a reviewer can check the accuracy of contact telephone numbers, building and room numbers, and whether the listed individuals are still in the organization.
(2) Analysis: An analysis may be performed on the entire plan or parts of it. The analyst may mentally follow the strategies in the COOP and look for flaws in the logic or process used by the plan's developers. The analyst may also interview functional managers, resource managers, and their staff to detect missing or unworkable pieces of the plan.
(3) Simulation and Test: Simulation and test consists of various types and scope of exercises designed to test and evaluate the COOP. In the structured walk-through, a disaster scenario is established, and the teams "walk-through" their assigned tasks. This is a role-playing activity that requires the participation of at least the team leaders and their alternates. A tactical exercise is a simulated exercise, conducted in a "war game" format. All members of the continuity organization are required to participate and perform their tasks and procedures under announced or surprise conditions. The exercise monitor provides information throughout the exercise to simulate events following an actual disaster. In a live production system exercise, an operating system is brought to live status on alternate platforms, and the data and communications network is switched to the alternate site. All resources, other than IT and communications hardware and software needed to support the exercise, will be retrieved and deployed from offsite (protected) storage, as applicable. A simulation requires the execution of notification, operating procedures, the use of equipment, hardware and software, possible use of alternate site(s), and operations to ensure proper performance. Simulation exercises should be used in conjunction with checklist exercises for identification of required plan modification and staff training.
(a) Announced exercises are scheduled exercises generally involving actual resumption of IT and other critical operations (e.g., command and control) at alternate site(s). IT operations are usually not interrupted but may be planned for actual resumption and validation at the "Hot Site." This type of test usually involves the entire continuity organization, including selected users along with Senior Management, operations and technical staff. Unannounced exercises are surprise exercises that require transfer of operations activity to the alternate site. All required activity continues in parallel and is not interrupted. This type of test generally involves only a small portion of the continuity organization.
(b) To ensure that testing is performed in a cost-effective manner, while still accomplishing the objective of validating the COOP, a separate test plan, with specific scenarios and outlines of acceptable responses, should be developed and followed by management representatives, such as the team conducting the test.
(c) Because the plan will become dated as time passes and resources change, responsibility for maintaining and updating the COOP should be specifically assigned. Maintenance of the COOP can be incorporated into procedures for change management so that upgrades to hardware and software are reflected in the Plan.
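The Review test described in item c(1) above, which checks contact telephone numbers, room numbers, and whether listed individuals are still in the organization, can be partly automated. The following is a minimal sketch in Python and is not part of this NPR. It assumes a current staff directory is available keyed by name; the `Contact` record, the `review_contact_list` function, and all sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contact:
    """One entry of a contact list (hypothetical record layout)."""
    name: str
    phone: str
    room: str


def review_contact_list(plan_contacts: list[Contact],
                        directory: dict[str, Contact]) -> list[str]:
    """Compare the plan's contact list with a current directory keyed by name.

    Returns one finding per discrepancy; an empty list means no
    differences were found.
    """
    findings = []
    for listed in plan_contacts:
        current = directory.get(listed.name)
        if current is None:
            findings.append(f"{listed.name}: not in current directory")
            continue
        if listed.phone != current.phone:
            findings.append(
                f"{listed.name}: phone {listed.phone} should be {current.phone}")
        if listed.room != current.room:
            findings.append(
                f"{listed.name}: room {listed.room} should be {current.room}")
    return findings


if __name__ == "__main__":
    # Sample data is invented for illustration only.
    plan = [Contact("A. Example", "555-0100", "B12"),
            Contact("B. Sample", "555-0101", "C3")]
    directory = {"A. Example": Contact("A. Example", "555-0199", "B12")}
    for finding in review_contact_list(plan, directory):
        print(finding)
```

A check of this kind covers only the accuracy of listed data; it does not replace the analysis or simulation tests described above.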
This document does not bind the public, except as authorized by law or as incorporated into a contract. This document is uncontrolled when printed. Check the NASA Online Directives Information System (NODIS) Library to verify that this is the correct version before use: https://nodis3.gsfc.nasa.gov.