As we continue to use advanced technology and more automation, our transportation systems, energy production systems, medical devices, manufacturing processes, and many other systems continue to increase in complexity. These complex systems create safety risks for their operators and for the communities they serve. System safety is an approach to managing hazards and risks in complex systems.
Many industries use system safety analyses and methodologies, by convention or as required by regulation, to help reduce the potential for harm to people, property, and the environment. For example, system safety analyses are being used to identify hazards and risks in the design and operation of:
Process safety management is a special case of system safety applied to hazardous chemicals, regulated in the United States through the OSHA Process Safety Management standard.
System safety and process safety management efforts play an important role in identifying, characterizing, and controlling hazards and reducing risk. A typical system safety process includes:
Tools and techniques used to perform the system safety analyses include:
Preliminary Hazard Analysis
Fault Tree Analysis
Hazard & Operability (HAZOP) Analysis
Event Tree Analysis
Failure Modes and Effects Analysis (FMEA)
Cause/Effect Analysis
What-If Analysis
Functional Hazard Analysis
Human Error Analysis
Software Hazard Analysis
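As an illustration of one of the techniques listed above, the following is a minimal sketch of how a fault tree might be evaluated numerically once probability estimates exist. It assumes that all basic events are statistically independent and that the tree uses only AND and OR gates. The class names, event names, and probabilities are hypothetical and chosen only for illustration; they do not come from any particular standard or tool.

```python
from dataclasses import dataclass
from math import prod
from typing import List, Union


@dataclass
class BasicEvent:
    """A leaf of the fault tree with an assumed probability of occurrence."""
    name: str
    probability: float


@dataclass
class Gate:
    """An intermediate or top event combining its inputs with AND or OR logic."""
    name: str
    kind: str  # "AND" or "OR"
    inputs: List[Union["Gate", BasicEvent]]


def probability(node: Union[Gate, BasicEvent]) -> float:
    """Return the probability of a node, assuming independent basic events."""
    if isinstance(node, BasicEvent):
        return node.probability
    child_probs = [probability(child) for child in node.inputs]
    if node.kind == "AND":
        # All inputs must occur: multiply the probabilities.
        return prod(child_probs)
    if node.kind == "OR":
        # At least one input occurs: complement of "none of them occur".
        return 1.0 - prod(1.0 - p for p in child_probs)
    raise ValueError(f"Unknown gate kind: {node.kind!r}")


# Hypothetical example: cooling is lost if both pumps fail or if power is lost.
both_pumps_fail = Gate(
    "Both pumps fail",
    "AND",
    [BasicEvent("Primary pump fails", 0.01), BasicEvent("Backup pump fails", 0.05)],
)
top = Gate(
    "Loss of cooling",
    "OR",
    [both_pumps_fail, BasicEvent("Loss of electrical power", 0.001)],
)

print(f"P({top.name}) = {probability(top):.7f}")
```

In practice the independence assumption often does not hold (for example, when a common cause can disable both pumps), so a calculation like this is a starting point for discussion rather than a verified risk figure.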
Because system safety and process safety management cannot prevent or mitigate all risks, we need systematic and rational approaches to planning for and responding to emergencies. That systematic approach is known as emergency management. Emergency management typically includes the following key components:
Identification and assessment of hazards and risks are also necessary elements of an effective emergency management program.
The level of effort required for system safety, process safety management, and emergency management activities typically depends on the complexity of the system and its operations and on the nature of the hazards.
Some broad tenets of system safety include the following (Leveson, Ericson, Hardy):
System safety emphasizes qualitative analyses over quantitative analyses. Early in the development of a system, quantitative data may not be available. Qualitative analyses make it possible to prioritize problems early in the development cycle so that intelligent design trade-offs can be made. This does not preclude quantitative analyses. As the development proceeds, quantitative data and probabilistic estimates can provide information to verify the reduction in risk. However, most of the focus of system safety is on qualitative analyses.
Safety is part of a series of tradeoffs. Safety is only one of many tradeoffs made in the development of a system. Typically, safety, cost, schedule, technical capabilities, and (at times) political considerations all help determine an acceptable design. We make trades based on whether the benefit we obtain from an activity exceeds the potential for harm. Creating value in any endeavor requires taking risks. Not all risks can be eliminated, but they must be managed.
System safety efforts require dedicated efforts and do not occur by chance. Safety of complex systems requires that safety be designed in. Efforts to design in safety require a dedicated system safety program, with appropriate resources. The system safety program must be a formal program using a disciplined approach and dedicated personnel to be effective.
The absence of accidents does not necessarily mean that the system is safe. Just because accidents have not occurred does not mean that they won’t. A lack of accidents could imply that an organization without a system safety program is just lucky. Accidents are not prevented by luck but by a disciplined safety effort. Effective system safety programs understand that risk is ever present and unresolved hazards will eventually manifest themselves.
System safety efforts require team participation. No single person can understand a complex system or all its components. Multiple participants from different technical disciplines are needed to assure completeness of the effort. Teams should include representatives of safety, environmental, and line management to help reduce uncertainties and redundancy of analysis activities. Subject matter experts are necessary to help identify hazards and the approaches needed to control them effectively.
System safety and emergency management work together to create resilient systems. While system safety tries to build in safety, the discipline recognizes that in spite of our best efforts there will be ways that things go wrong because of many different factors, some beyond an organization's control. Therefore, system safety not only tries to prevent an accident but also tries to find ways to prevent a bad situation from becoming worse and recover once an event has occurred. In strong safety management and engineering efforts, system safety and emergency management disciplines and personnel are integrated to assure continuity of operations after a major mishap or in the presence of continuous stress.
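The tenet above on qualitative analyses describes prioritizing problems before quantitative data are available. A minimal sketch of that idea is a qualitative risk matrix, in which each hazard is assigned a severity category and a likelihood category and the hazards are then ranked. The category names, the ordinal ranks, the thresholds for "high", "medium", and "low", and the example hazards below are all assumptions made for illustration; a real program would take them from its own governing standard or policy.

```python
# Ordinal ranks for each category (assumed; higher means worse).
SEVERITY = {"negligible": 1, "marginal": 2, "critical": 3, "catastrophic": 4}
LIKELIHOOD = {"improbable": 1, "remote": 2, "occasional": 3, "probable": 4, "frequent": 5}


def risk_score(severity: str, likelihood: str) -> int:
    """Combine the two ordinal ranks into a single index used only for ranking."""
    return SEVERITY[severity] * LIKELIHOOD[likelihood]


def risk_level(severity: str, likelihood: str) -> str:
    """Map a severity/likelihood pair to a coarse level (thresholds are assumed)."""
    score = risk_score(severity, likelihood)
    if score >= 12:
        return "high"
    if score >= 6:
        return "medium"
    return "low"


def prioritize(hazards):
    """Return hazards sorted from highest to lowest risk score."""
    return sorted(hazards, key=lambda h: risk_score(h[1], h[2]), reverse=True)


# Hypothetical hazard log: (description, severity, likelihood).
hazards = [
    ("Loose cover panel", "negligible", "remote"),
    ("Loss of braking", "catastrophic", "remote"),
    ("Operator display freeze", "marginal", "occasional"),
    ("Coolant leak", "critical", "probable"),
]

ranked = prioritize(hazards)
for description, severity, likelihood in ranked:
    print(f"{risk_level(severity, likelihood):6s} {description}")
```

The score here is only an ordering device. Multiplying ordinal ranks does not produce a probability or an expected loss, which is why such a matrix supports early prioritization but does not replace later quantitative verification.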
Leveson, N., Safeware: System Safety and Computers, Addison Wesley, 1995.
Ericson, C., “The First Principles of System Safety,” Journal of System Safety, March-April 2009.
Hardy, T.L., Emergency Planning and Response: Case Studies and Lessons Learned, BookLocker, 2013.
System Safety: the application of special technical and managerial skills to the systematic, forward-looking identification and control of hazards throughout the life cycle of a project, program, or activity. The objective of system safety is to prevent accidents.
Process Safety Management: the application of management principles and systems to the identification, understanding, and control of process hazards to protect employees, facility assets, and the environment.
Emergency Management: the coordination and integration of all activities necessary to build, sustain, and improve the capability to prepare for, protect against, respond to, recover from, or mitigate against threatened or actual natural disasters, acts of terrorism, or other man-made disasters.