
NASA Procedures and Guidelines

This Document is Obsolete and Is No Longer Used.
Check the NODIS Library to access the current version:
http://nodis3.gsfc.nasa.gov


NPR 8715.3
Eff. Date: January 24, 2000
Cancellation Date: September 12, 2006

NASA Safety Manual w/Change 2, 03/31/04



CHAPTER 3. System Safety


3.1 Purpose

This chapter establishes procedures for the implementation of system safety processes to ensure the identification and reduction of program safety risks to an acceptable level to enhance mission success.

3.2 Applicability and Scope

3.2.1 For simplicity, "programs" shall be interpreted to include programs, projects, and acquisitions (Requirement 25242). When the work is performed in-house at NASA, the term Contractor shall be interpreted to apply to the in-house activity (Requirement 32102).

3.2.2 NASA requires system safety tasks for systems acquisitions, in-house developments, facility design/modifications, and Agency operations and activities (Requirement 25243). For joint ventures between NASA and other parties including commercial services, interagency efforts, and international partnerships, application of these practices shall be as specified in related contracts, memoranda of understanding, NPDs, or other documents, and will consider the degree of NASA responsibility in the venture (Requirement 32103).

3.2.3 The program/project manager, in conjunction with the local safety and mission assurance organization, shall determine minimum mission success criteria (Requirement 25074). He or she will then determine the degree to which specific procedures and requirements contained in this chapter are implemented. They shall consider the potential for personnel injury, mission failure, equipment loss or facility damage, or property damage, the impact to cost and schedule, and the visibility of the program to the public (Requirement 32104). The process is called "tailoring." The final mission success planning activity will be documented and approved as an element of the risk management planning portion of the program plan. A safety plan may be requested as a separate document. A sample format is shown in Appendix H. (See NPR 7120.5, "Program and Project Management Process and Requirements," paragraph 4.5.1.2.)

3.2.4 Tailored system safety activities shall be planned and documented during the formulation phase for the following:

a. Aeronautical systems (Requirement 32105).

b. Human crewed and robotic space flight systems (Requirement 32106).

c. Payloads (spacecraft, internal and external payloads, and experiments flown on aircraft, Space Shuttles, International Space Station, Expendable Launch Vehicles (ELVs), balloons, and sounding rockets) (Requirement 32107).

d. Major facilities acquisition programs (Requirement 32108).

e. Support equipment, including ground and airborne, test, maintenance, and training equipment (Requirement 32109).

f. Related safety-critical software (Requirement 32110).

3.2.5 A systematic approach to safety should also be applied to operations and supporting activities including construction, fabrication and manufacture, experimentation and test, packaging and transportation, storage, checkout, launch, flight, use, reentry, retrieval and disassembly, maintenance and refurbishment, modification, and disposal.

3.2.6 Programs with existing approved system safety tasks containing adequate definition of the risk assessment and management process are not required to comply with any new requirements of this chapter, but any changes made in their system safety task must comply with this chapter (Requirement 25077). This chapter shall not supersede or prevent the application of more stringent requirements imposed by programs (Requirement 32111).

3.3 Objective

The principal objective of a system safety activity is to provide for an organized, disciplined approach to the early identification and resolution of hazards impacting personnel, hardware, or mission success to a level as low as reasonably achievable (ALARA). The system safety activity will use the 5-step risk management approach shown in figure 3.1. (See NPR 7120.5, "Program and Project Management Process and Requirements," paragraph 4.2.) The five steps of the risk management approach are as follows:

3.3.1 Identify and document the system safety and mission success risks (hazards) early in the program and continue to update the status of these risks and any newly identified risks throughout the program or project.

Figure 3.1 Continuous Risk Management Process

3.3.2 Analyze the risks (hazards) for probability, impact/severity, and time frame. When that is complete, prioritize the risks.

3.3.3 Plan what should be done to eliminate or reduce the risks, and provide the planning and decisionmaking documentation to the appropriate levels of program management for a decision to eliminate, further reduce, or accept the risk. Institute hazard mitigation (corrective) actions.

3.3.4 Track the results of the corrective actions and continue to verify and validate their effectiveness.

3.3.5 Control or change the corrective action plans based on the effectiveness of the mitigation actions.

3.4 Hazard Reduction Protocol

Hazards will be mitigated according to the following order of precedence (Requirement 25079):

3.4.1 Eliminate hazards.

3.4.2 Design for minimum hazards.

3.4.3 Incorporate safety devices.

3.4.4 Provide caution and warning devices.

3.4.5 Develop administrative procedures and training.

(Note 1: Providing protective clothing and equipment is considered an administrative procedure.)

(Note 2: Some hazards may require the combination of several of these approaches to mitigation.)
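The order of precedence above can be illustrated with a short sketch. This is not part of the requirement; the enumeration names, the `order_by_precedence` helper, and the sample controls are hypothetical and are used only to show how candidate mitigations might be ranked so that higher-precedence approaches are considered first.

```python
from enum import IntEnum


class Mitigation(IntEnum):
    """Mitigation approaches from paragraphs 3.4.1 to 3.4.5.

    A lower value means a higher precedence. The member names are
    illustrative assumptions, not official terminology.
    """
    ELIMINATE = 1                 # 3.4.1 Eliminate hazards
    DESIGN_FOR_MINIMUM = 2        # 3.4.2 Design for minimum hazards
    SAFETY_DEVICE = 3             # 3.4.3 Incorporate safety devices
    CAUTION_AND_WARNING = 4       # 3.4.4 Provide caution and warning devices
    ADMIN_AND_TRAINING = 5        # 3.4.5 Administrative procedures and training


def order_by_precedence(controls):
    """Sort (description, Mitigation) pairs so preferred approaches come first."""
    return sorted(controls, key=lambda control: control[1])


# Hypothetical candidate controls for a single hazard.
candidates = [
    # Note 1: protective equipment counts as an administrative procedure.
    ("Protective gloves", Mitigation.ADMIN_AND_TRAINING),
    ("Pressure relief valve", Mitigation.SAFETY_DEVICE),
    ("Replace toxic propellant", Mitigation.ELIMINATE),
]

ordered = order_by_precedence(candidates)
for description, approach in ordered:
    print(f"{approach.value}. {approach.name}: {description}")
```

As Note 2 indicates, a hazard may need several of these approaches together, so the ranking is a way to order consideration rather than a rule for selecting exactly one control.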

3.5 Responsibilities

3.5.1 Program/project managers (or equivalent) shall do the following:

3.5.1.1 Implement a tailored system safety and mission success activity based on the loss potential of the program and provide adequate resources to achieve the safety objectives (Requirement 25080). Depending upon complexity, a program will typically budget 3 to 5 percent of direct engineering and operations staff hours to support safety and mission assurance requirements.

3.5.1.2 Assign a System Safety Manager (SSM) (e.g., product assurance manager, flight safety manager, or flight assurance manager), in coordination with the Center Safety and Mission Assurance (SMA) Director, to have specific responsibility for executing the system safety tasks within the project (Requirement 25081). The onsite SSM will report to the program/project manager for program direction and to the Center SMA Director for policy and functional direction.

3.5.1.3 Implement and maintain the system safety and mission success planning portion of the risk management activity of the program plan with guidance and assistance from the local SMA organization (Requirement 25082). A separate stand-alone safety plan may be requested.

3.5.1.4 Ensure that system safety analyses appropriate to program complexity have been conducted (Requirement 25083). These analyses must include early interaction with the engineering, integration, and operations functions to ensure all hazards are identified and documented (Requirement 32112). The NASA Lessons Learned Information System (LLIS) will be used to supplement the normal program hazard assessment process.

3.5.1.5 Perform system safety and mission success reviews of the program (Requirement 25084). The greater the potential risks (e.g., complexity or visibility of the programs), the greater the independence and formality of the review required. Major programs such as the Space Shuttle or the International Space Station will have dedicated independent assessment activities (Requirement 32113).

3.5.1.6 Establish a formal, closed loop, risk acceptance process to identify and track program hazards with residual risk (Requirement 25085). Ensure residual risks are accepted in writing (Requirement 32114). Regardless of the size of the program, only the program/project manager or system acquisition manager is permitted to accept residual critical and catastrophic safety risks. A sample format for risk identification, assessment, and approval is in Appendix E. In all cases, where a decision is made to accept a risk, that decision will be coordinated with the governing SMA organization and communicated to the next higher level of management for review (Requirement 32115).

3.5.1.7 Issue program directives, specifications, and standards that provide uniform and systematic application of safety policy and requirements (Requirement 5086).

3.5.1.8 Assign sufficient numbers of personnel of appropriate experience and skills to perform system safety tasks (Requirement 25087). Provide training when necessary (Requirement 32116).

3.5.2 Assigned system safety managers shall do the following:

3.5.2.1 Possess appropriate technical and managerial training and expertise for conducting an effective safety process (Requirement 32117).

3.5.2.2 Advise the program/project manager regarding NASA requirements for and status of the tailored system safety task (Requirement 25089).

b. Organize the system safety effort to ensure maximum effectiveness in interacting with engineering, operations, integration, and program management (Requirement 32119).

c. Ensure specific safety requirements are integrated into overall programmatic requirements, and are reflected in applicable specifications and planning documents (Requirement 32120).

d. Determine which required hazard analysis tools and techniques (see Appendix D) will be used to ensure compliance with NASA and program safety policy and directives and when they will be used to produce safety and mission assurance documentation (Requirement 32121). Ensure the selected tools and techniques are used in an iterative process to identify all program hazards, causes, detailed control requirements, and control verifications (Requirement 32122).

e. Determine reporting requirements for all levels of the originating organization to support the system safety task (i.e., contractor, element, or NASA organization) (Requirement 32123). Establish criteria for submittal (milestone, periodic, event), format, and distribution, and ensure the program provides for submittal of the required reports (Requirement 32124).

f. Assist the program/project manager in documenting and communicating the acceptance of risks (Requirement 32125).

3.5.2.4 Conduct periodic independent reviews of the system safety tasks keyed to program milestones (Requirement 25091).

3.5.2.5 Assist and support independent review groups chartered to provide independent assessment of the program (Requirement 25092).

3.5.2.6 Maintain an up-to-date database of identified hazards throughout the life of the program (Requirement 25093).

3.5.2.7 Maintain the appropriate safety oversight or insight of the program tests, operations, or activities at a level consistent with mishap potential for the life of the program (Requirement 25094).

3.5.2.8 Establish an independent safety reporting path (see NPD 8700.1, "NASA Policy for Safety and Mission Success") to keep the OSMA apprised of the system safety status, particularly regarding problem areas that may require assistance from Headquarters (Requirement 25095).

3.5.2.9 Support the OSMA independent safety assessment process (e.g., Space Shuttle Pre-launch Assessment Reviews, International Space Station Design and Assessment Reviews) to determine readiness to conduct tests and operations having significant levels of safety risks, and provide real-time safety assessments to the OSMA, when appropriate, while tests and operations are in progress (Requirement 25096).

3.6 Hazard Assessment

The hazard assessment process is a principal factor in the understanding and management of technical risk. Hazards are identified and resultant risks are assessed by considering probability of occurrence and severity of consequence. Risk may be assessed qualitatively or quantitatively. System safety is an integral part of the overall program risk management decision process. A sample format to document the risk process is provided in Appendix E.

3.6.1 Risk Assessment Code (RAC). The RAC is a numerical expression of comparative risk determined by an evaluation of both the potential severity of a condition and the probability of its occurrence. RACs are assigned a number from 1 to 7 in a risk matrix (see figure 3.2). The RAC number will serve as a means to prioritize corrective actions, e.g., RAC 1 is unacceptable and mitigation actions must be taken immediately or operations terminated, RAC 2s must be addressed before RAC 3s, etc. (Requirement 25246). Differences between higher-numbered RACs (beyond 4) probably cannot be discerned due to low risk levels. The cognizant safety and program officials may approve variations to the matrix.

3.6.1.1 Severity is an assessment of the worst potential consequence, defined by degree of injury or property damage, which could occur. The severity classifications are defined as follows:

Class I - Catastrophic - A condition that may cause death or permanently disabling injury, facility destruction on the ground, or loss of crew, major systems, or vehicle during the mission.

Class II - Critical - A condition that may cause severe injury or occupational illness, or major property damage to facilities, systems, equipment, or flight hardware.

Class III - Moderate - A condition that may cause minor injury or occupational illness, or minor property damage to facilities, systems, equipment, or flight hardware.

Class IV - Negligible - A condition that could cause the need for minor first aid treatment though would not adversely affect personal safety or health. A condition that subjects facilities, equipment, or flight hardware to more than normal wear and tear.

3.6.1.2 Probability is the likelihood that an identified hazard will result in a mishap, based on an assessment of such factors as location, exposure in terms of cycles or hours of operation, and affected population. The following is an example of Probability Estimation:

A - Likely to occur immediately. ($X > 10^{-1}$)

B - Probably will occur in time. ($10^{-1} > X > 10^{-2}$)

C - May occur in time. ($10^{-2} > X > 10^{-3}$)

D - Unlikely to occur. ($10^{-3} > X > 10^{-6}$)

E - Improbable to occur. ($10^{-6} > X$)

(derived from Mil Std 882-System Safety Program Requirements)
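The example probability bands above can be expressed as a simple classification function. The following is a minimal sketch, not an official algorithm. The function name `probability_level` is hypothetical, and because the bands are stated with strict inequalities, the handling of a value that falls exactly on a boundary is an assumption made here (it is assigned to the less likely level).

```python
def probability_level(x: float) -> str:
    """Map a probability estimate X to an example level A through E.

    Assumption: a value exactly equal to a band boundary (for example
    X == 0.1) is placed in the less likely level, since the source
    bands use strict inequalities and do not cover that case.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if x > 1e-1:
        return "A"  # Likely to occur immediately
    if x > 1e-2:
        return "B"  # Probably will occur in time
    if x > 1e-3:
        return "C"  # May occur in time
    if x > 1e-6:
        return "D"  # Unlikely to occur
    return "E"      # Improbable to occur


for estimate in (0.5, 0.05, 0.005, 1e-4, 1e-7):
    print(estimate, probability_level(estimate))
```

Because paragraph 3.6 allows risk to be assessed qualitatively as well as quantitatively, a numeric estimate of this kind will not always be available; the sketch applies only where one is.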



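RAC by Severity Class (rows) and Probability Estimate (columns)

| Severity Class | A | B | C | D | E |
| --- | --- | --- | --- | --- | --- |
| I | 1 | 1 | 2 | 3 | 4 |
| II | 1 | 2 | 3 | 4 | 5 |
| III | 2 | 3 | 4 | 5 | 6 |
| IV | 3 | 4 | 5 | 6 | 7 |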

Figure 3.2 Risk Assessment Code Matrix

(See paragraph 3.6.1 for RAC usage.)
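The matrix in figure 3.2 amounts to a lookup from severity class and probability estimate to a RAC. The sketch below encodes the figure's values and uses them to order hazards as paragraph 3.6.1 describes. The names `RAC_MATRIX`, `rac`, and `prioritize`, and the sample hazard identifiers, are hypothetical; a program with an approved variation to the matrix would substitute its own values.

```python
# RAC values copied from figure 3.2: severity class -> probability -> RAC.
RAC_MATRIX = {
    "I":   {"A": 1, "B": 1, "C": 2, "D": 3, "E": 4},
    "II":  {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5},
    "III": {"A": 2, "B": 3, "C": 4, "D": 5, "E": 6},
    "IV":  {"A": 3, "B": 4, "C": 5, "D": 6, "E": 7},
}


def rac(severity: str, probability: str) -> int:
    """Return the RAC for a severity class (I-IV) and probability (A-E)."""
    try:
        return RAC_MATRIX[severity][probability]
    except KeyError:
        raise ValueError(
            f"unknown severity/probability: {severity!r}/{probability!r}"
        ) from None


def prioritize(hazards):
    """Sort (hazard_id, severity, probability) tuples by ascending RAC.

    Lower RACs are addressed first, as stated in paragraph 3.6.1.
    """
    return sorted(hazards, key=lambda h: rac(h[1], h[2]))


# Hypothetical hazards used only to show the ordering.
hazards = [
    ("H-3", "III", "D"),  # RAC 5
    ("H-1", "I", "B"),    # RAC 1: unacceptable, act immediately
    ("H-2", "II", "C"),   # RAC 3
]

for hazard_id, severity, probability in prioritize(hazards):
    print(hazard_id, rac(severity, probability))
```

Since the text notes that differences between RACs beyond 4 probably cannot be discerned, the ordering among such hazards should be treated as a rough guide only.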

3.7 Safety Activity Phases

As presented in figure 3.3, the hazard assessment process begins in the formulation stage and continues, in varying degrees, throughout the program's life cycle. This involvement begins with the early design concepts. The system safety and mission success hazard analysis effort shall be a continuing and iterative process influencing the system in a manner which manages risk as the design progresses and matures (Requirement 25097).

3.8 System Safety and Mission Success Hazard Analyses

3.8.1 System safety analyses provide a means to systematically and objectively identify hazards, determine their risk level, and suggest the mechanism for their elimination or control. This iterative process begins in the conceptual phase and extends throughout the life cycle including disposal. The extent and depth of analysis required to meet the following five functions will be determined by system complexity and loss potential. Functions supported by the analyses include the following:

3.8.1.1 Providing the foundation for the development of safety criteria and requirements.

3.8.1.2 Determining whether and how the safety criteria and requirements provided to engineering have been included in the design.

3.8.1.3 Determining whether the safety criteria and requirements created for design and operations have provided an acceptable level of risk for the system.

3.8.1.4 Providing a roadmap (or methodology) for the development of safety goals and mission success criteria.

3.8.1.5 Providing a means for demonstrating that safety goals have been met.

3.8.2 During the hazard identification process, it is essential to remain non-judgmental about the associated probability, severity, and corrective action. Once identified, hazards are ranked by severity, probability of occurrence, and program impact (risk assessment). Sufficient analyses are performed to assess the likelihood of occurrence (usually qualitative for early assessments) for each undesired event identified.

3.8.3 There are several types of analyses necessary to identify all the hazards, some of which are specialized and others which, as designs mature, build on previously accomplished analyses.

3.8.3.1 The first safety analysis is the Preliminary Hazard Analysis (PHA), which shall be performed early (Requirement 32126). Other primary analyses shall include the Subsystem Hazard Analysis (SSHA), Component Level Fault Tree Analysis (FTA), Software Hazard Analysis (SWHA) (see NASA Standard 8719.13A, "Software Safety," for more information), System Hazard Analysis (SHA), Operating and Support Hazard Analysis (O&SHA), Job Hazard Analysis (JHA), Human Factors Engineering Analysis, the Safety Requirements Compliance Matrix, and Integrated Hazard Analysis (IHA), unless otherwise indicated by the PHA (Requirement 32127). Data from these analyses can be used to offer recommendations to reduce risks.

3.8.3.2 The hazard analyses should use data developed by other types of analyses when available, such as the Failure Modes and Effects Analysis/Critical Items Lists (FMEA/CIL), Operations Analysis, Human Factors Engineering Analysis, and Maintainability Analysis. The safety analyst may have to develop specific, limited data to support the hazard analyses if the other analyses are not performed. FMEA/CIL analyses support, but are not an alternative to, the system safety analyses in paragraph 3.8.3.1. See Appendix D for further information on these analysis processes and techniques.

3.9 System Safety and Mission Success Program Reviews

The program/project manager or his or her designated agent shall conduct one or more system safety and mission success reviews depending on the complexity of the system (Requirement 25099). These reviews may be in conjunction with other program milestones. The purpose of these reviews is to evaluate the status of hazard analyses, residual risks, hazard controls, verification techniques, technical safety requirements, and program implementation throughout all the phases of the system life cycle. These reviews shall focus on the evaluation of management and technical documentation and the safety residual risks remaining in the program at that stage of development (Requirement 32129).

3.10 Documentation

3.10.1 The system safety task requires creation and maintenance of documentation that provides ready traceability from the baseline safety requirements, criteria, and effort planned in the conceptual phases through the life cycle of the program. All pertinent details of the hazard analysis and review shall be traceable from the initial identification of the hazard through its resolution and any updates, using the continuous risk management approach, until such time in the program as it is no longer applicable (Requirement 25100). Records shall be maintained per NPR 1441.1, "NASA Records Retention Schedules" (Requirement 32130).

3.10.2 The SSM shall submit a report to management at each milestone (formulation, evaluation, implementation, or other equivalent milestones (PDR, CDR, DCR, and FRR, etc.)) detailing the results of the safety assessment to document the status of system safety tasks required by the program (Requirement 25101). In the report, the safety analyst shall do the following:

3.10.2.1 List residual risks baselined and potential risks that have yet to be resolved (Requirement 32132).

3.10.2.2 Document management and technical changes that affect the established safety baseline (Requirement 32133).

3.10.2.3 Document and verify adequate resolution of the hazards and obtain written acceptance of the risk from the program/project manager to complete the audit trail (Requirement 32134).

3.11 Change Review

Systems are changed during their life to enhance capabilities, provide more efficient operation, and incorporate new technology. With each change, the original safety aspects of the system could be impacted, either increasing or reducing the risk. Any aspect of controlling a hazard could be weakened, new hazards could be created, or conversely, hazards could be eliminated. Even a change that appears inconsequential could have significant impact on the baseline risk of the system. Accordingly, proposed system changes should be subjected to a safety review or analysis as appropriate to assess the safety impact. HRs will be updated when required to show any identified risk change (Requirement 25102). Each change initiator shall ensure that safety personnel assess the potential safety impact of the proposed change and any changes to the baseline risk (Requirement 32137). Changes proposed to correct a safety problem shall also be analyzed to determine the amount of safety improvement (or detriment) that would actually result from incorporation of the change (Requirement 32138). There shall be a documented statement of safety impact for every change that is proposed to a program baseline (even if the statement is "No Impact") (Requirement 32139).
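The requirement that every proposed baseline change carry a documented statement of safety impact lends itself to a simple completeness check. The sketch below is illustrative only. The `ProposedChange` record, its field names, the `missing_impact_statements` helper, and the sample change identifiers are hypothetical and do not reflect any NASA change-control system.

```python
from dataclasses import dataclass


@dataclass
class ProposedChange:
    """Hypothetical record for a change proposed to a program baseline."""
    change_id: str
    description: str
    # An explicit "No Impact" is a valid statement; an empty string is not.
    safety_impact_statement: str = ""


def missing_impact_statements(changes):
    """Return the IDs of changes lacking a documented safety impact statement."""
    return [
        change.change_id
        for change in changes
        if not change.safety_impact_statement.strip()
    ]


# Hypothetical changes used only to exercise the check.
changes = [
    ProposedChange("CR-001", "Update connector part number", "No Impact"),
    ProposedChange("CR-002", "Raise tank operating pressure"),
    ProposedChange("CR-003", "Add relief valve",
                   "Reduces overpressure risk; hazard report to be updated"),
]

print(missing_impact_statements(changes))
```

A check of this kind confirms only that a statement exists. Whether the statement is adequate remains a matter for the safety personnel who assess the change.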




DISTRIBUTION:
NODIS

