Effective system safety and emergency management efforts require learning from both failure and success. Lessons learned are presented here, often illustrated through an accident or incident. Note that in discussing these events, the intent is not to oversimplify the conditions that led to the incidents or to place blame on individuals and organizations. Rarely is there only one identifiable cause leading to an accident. Accidents and incidents are usually the result of complex factors that include hardware, software, human interactions, procedures, and organizational influences. Readers are encouraged to review the full investigation reports referenced below to understand the often complex conditions that led to each accident discussed here.
Pipeline Rupture in Saskatchewan
On April 15, 2007, an oil pipeline rupture occurred downstream of a pump station near Glenavon, Saskatchewan. The rupture released 990 cubic meters of crude oil into a wetland area. No injuries were reported. According to the Transportation Safety Board of Canada (TSB), the rupture was the result of pipeline corrosion and failure. The original construction of the pipeline allowed the use of polyethylene tape coating over welds. In one location the coating “tented” over a longitudinal weld, allowing the surface underneath the tape to come in contact with corrosive material. Enbridge, the pipeline owner, had developed a crack management program to address the possibility of pipeline failure due to fatigue cracking. This program included monitoring, inspections, and engineering analyses to estimate remaining life. In February 2002 Enbridge had excavated the portion of the pipe that failed in April 2007. Once the pipe was excavated, the company performed nondestructive testing using magnetic particle and ultrasonic examinations. This testing showed four anomalies, including cracks. The company used a crack growth model to estimate the expected pipe lifetime. However, according to the TSB, the input values used for that model were not appropriate. As stated in the TSB report, “the input values to the crack growth model did not accurately reflect the uncertainties in measured values and predicted growth was less than actual growth.” The analysis therefore underestimated the actual crack growth rate and overestimated the pipe lifetime. Following the accident, Enbridge made a number of changes to its crack management program, including revisions to its fatigue analysis procedures.
Lessons Learned: A model is a physical, mathematical, or otherwise logical representation of a system, entity, phenomenon, or process. Some uncertainty is to be expected in any model, but the credibility of a model and of its input values must be established prior to use. Even valid models and simulations can be misused if those using them do not understand their limitations or are not trained to use them. Insufficient and inaccurate modeling is a hazard cause that should be considered in system safety analyses.
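To illustrate how sensitive a remaining-life estimate can be to an uncertain input, the sketch below uses the textbook Paris' law relation for fatigue crack growth, da/dN = C (ΔK)^m with ΔK = Y Δσ √(πa). This is not the model Enbridge used, nor one taken from the TSB report; the function name and every numeric value are hypothetical and chosen only for illustration.

```python
import math


def cycles_to_critical(a0, ac, C, m, delta_sigma, Y=1.0):
    """Cycles for a crack to grow from depth a0 to critical depth ac.

    Closed-form integration of Paris' law, da/dN = C * dK**m, with
    dK = Y * delta_sigma * sqrt(pi * a). Valid for m != 2.
    Units assumed: a in metres, stress range in MPa, C consistent with both.
    """
    if m == 2:
        raise ValueError("closed form used here requires m != 2")
    if not 0 < a0 < ac:
        raise ValueError("need 0 < a0 < ac")
    k = C * (Y * delta_sigma * math.sqrt(math.pi)) ** m
    p = 1.0 - m / 2.0
    return (ac ** p - a0 ** p) / (k * p)


# Hypothetical inputs, for illustration only.
C = 1e-11            # Paris coefficient (assumed)
m = 3.0              # Paris exponent (assumed)
delta_sigma = 100.0  # stress range per pressure cycle, MPa (assumed)
ac = 5.0e-3          # critical crack depth, m (assumed)

a0_measured = 1.0e-3   # crack depth reported by inspection, m (assumed)
sizing_error = 0.5e-3  # inspection sizing uncertainty, m (assumed)

# Estimate that takes the measured depth at face value.
life_nominal = cycles_to_critical(a0_measured, ac, C, m, delta_sigma)

# Estimate that allows for the crack being deeper than measured.
life_conservative = cycles_to_critical(
    a0_measured + sizing_error, ac, C, m, delta_sigma
)

print(f"nominal estimate:      {life_nominal:,.0f} cycles")
print(f"conservative estimate: {life_conservative:,.0f} cycles")
print(f"ratio: {life_conservative / life_nominal:.2f}")
```

Under these assumed values, treating the measured depth as exact gives a noticeably longer predicted life than allowing for the sizing error. That is the general pattern the lesson warns about: inputs that ignore measurement uncertainty can make a valid model produce an optimistic answer.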
Transportation Safety Board of Canada, “Crude Oil Pipeline Rupture, Enbridge Pipelines Inc., Line 3, Mile Post 506.2217 Near Glenavon, Saskatchewan, 15 April 2007,” Pipeline Investigation Report P07H0014, July 16, 2008.
Grounding of CSL Thames
On August 9, 2011, the bulk carrier CSL Thames grounded in the Sound of Mull, between the Isle of Mull and the Scottish mainland. The hull of the ship was damaged and one of the ballast tanks flooded. There were no injuries reported. The Marine Accident Investigation Branch (MAIB) found that just prior to the accident the third officer had altered the vessel’s course to avoid another vessel. The altered course took CSL Thames into shallow water. The Electronic Chart Display and Information System (ECDIS) displayed a warning that the ship was in danger, but the third officer did not see the visual warning because he was focused on avoiding a collision. The ECDIS was also equipped with an audible alarm, but the alarm did not sound. The MAIB later found that the ECDIS unit was not connected to a loudspeaker or buzzer capable of sounding the alarm. The MAIB report also noted that loud music was often played on the bridge, and the alarm might not have been heard even if the system had been functioning. The report went on to say that the crew failed to question the absence of an alarm; as stated in the report, “This indicates a lack of understanding of the equipment’s safety features and/or their value.”
Lessons Learned: A significant benefit of strong risk assessment efforts is communication of risks both within an organization and to external entities. Operators must understand the hazards and the purpose of the hazard controls. If the risk is not properly presented and communicated, risk reduction measures may not be implemented, or employees and private citizens may make poor risk decisions. For example, mitigation measures may be ignored or disabled, as was the case in this incident. Proper communication of risk allows those operating a system to understand the hazards, implement measures to protect themselves and their coworkers, and take appropriate actions in an emergency.
Marine Accident Investigation Branch, “Grounding of CSL THAMES in the Sound of Mull 9 August 2011,” Report No. 2/2012, March 2012.
Train Derailment in Virginia
On January 5, 2006, a northbound Virginia Railway Express (VRE) commuter train derailed at Possum Point near Quantico, Virginia. Seven passengers and two crew members required medical attention after the accident. The National Transportation Safety Board (NTSB) report on the accident stated that the probable cause was an excessively worn and chipped switch point. The condition of the switch point caused the fourth passenger car to derail. The portion of the track where the derailment occurred had been inspected monthly, per requirements. Recent inspections had identified that this switch point was worn and/or chipped. A vendor had been contacted to replace the switch point. While waiting for the replacement parts, the switch point was welded as a temporary repair. The NTSB report stated that other places on the track had shown similar wear, and speed restrictions were put in place for trains traveling over those locations until replacement parts could be procured. The report stated that such speed restrictions were not implemented at Possum Point. The report noted that the replacement parts arrived on the day of the accident.
Lessons Learned: In the case of the VRE incident, welding repairs and speed restrictions were intended as workarounds to compensate for failures in safety-critical systems. Care should be taken when deciding to rely on workarounds in place of permanent mitigation measures. Such workarounds typically depend on operating procedures, which are generally less effective than engineering solutions. These workarounds become even less effective when the procedures are informal.
U.S. National Transportation Safety Board, “Derailment of Virginia Railway Express Train, Quantico, Virginia, January 5, 2006,” Railroad Accident Brief RAB-06-06, November 20, 2006.