Episode 8 – The Occurrence of Failure: The Ethiopian Airlines Disaster
This episode addresses the second crash of a Boeing 737 MAX 8 aircraft within four and a half months: Ethiopian Airlines Flight 302.
What's amazing in this part of the MAX 8 story is that new calculations demonstrated that the risk of another crash was significant. However, leaders across the airline industry couldn't break their existing mindset and failed to ground the MAX.
What's also amazing is that the crew, who now knew about MCAS, nevertheless lost control of the aircraft.
What's less known, however, is that the Ethiopian cockpit emergency had two phases - MCAS wasn't even involved until the second phase. The whole story is captured in this episode.
Please share your views, insights, and opinions through the MAX 8 Podcast Comments form. Episode 12 will be dedicated to feedback from listeners such as you.
-
EPISODE HIGHLIGHTS:
(1:07) – Completing the post-crash events of Lion 610.
(3:06) – A new view on risk (the TARAM assessment).
(8:05) – A sanguine view persists.
(11:15) – What was happening here?
(16:05) – Phase One of the cockpit emergency for Ethiopian Flight 302.
(26:47) – Phase Two of the cockpit emergency.
(33:24) – Key points about causality in this story.
(37:59) – Immediate actions after the crash.
(38:57) – Conclusions on three flight emergencies and two crashes.
(43:46) – Lessons regarding the Framework of Failure.
KEY POINTS:
Responses to the Lion Air crash, and a new risk assessment.
Notably, the FAA developed a more concerning view of risk based on a "TARAM" assessment, which estimated that, on average, 15.4 accidents would occur over the assumed 30 to 40-year lifespan of the entire MAX fleet, leading to a loss of 2,921 lives. That works out to roughly one crash every two to three years, or roughly a 25% probability that an MCAS failure would cause a crash in the eight months leading up to the planned June 2019 software update.
No senior FAA official claims to have seen this analysis. However, it is one of the story's critical turning points, along with MCAS’s “design drift” from the wind-up turn situation to the low-altitude situation. Risk experts point out that, in general, such analyses are best treated as rough guides: they make a qualitative judgment built on rough assumptions look quantitative – that is, more precise than reality warrants.
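For listeners who want to see how those headline numbers relate to one another, here is a rough back-of-the-envelope sketch in Python. It is my own illustrative arithmetic, not the FAA's actual TARAM methodology, and it assumes a simple constant-rate (Poisson) model and a 35-year midpoint for the fleet lifespan.

```python
import math

# Back-of-the-envelope sketch (illustrative assumptions, not the FAA's TARAM model)
expected_accidents = 15.4        # TARAM estimate over the fleet's assumed lifespan
fleet_lifespan_years = 35.0      # midpoint of the 30 to 40-year assumption

accidents_per_year = expected_accidents / fleet_lifespan_years   # ~0.44 per year
years_between_crashes = 1.0 / accidents_per_year                 # ~2.3 years

# Probability of at least one crash in the ~8 months before the planned June 2019 fix,
# treating crashes as a constant-rate (Poisson) process
window_years = 8.0 / 12.0
p_crash_in_window = 1.0 - math.exp(-accidents_per_year * window_years)

print(f"Roughly one crash every {years_between_crashes:.1f} years")
print(f"Probability of a crash in eight months: {p_crash_in_window:.0%}")   # ~25%
```

Under those assumptions, the output lands close to the "one crash every two to three years" and "25%" figures cited above.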
Despite new data, everyone defended their position, doggedly holding on to prior perspectives and decisions. Weak signals, some becoming louder, were being ignored. Everyone wanted to keep the airplanes in the air: the airlines, the civil aviation agencies, and Boeing.
This and a number of other responses to failure discussed in this episode set the stage for another failure. They allowed the same socio-technical arrangement to continue. The only difference was that pilots now knew about MCAS but not what it did, how it worked, or how it behaved.
Ethiopian Flight 302.
This time only four MCAS activations occurred—and only three of these moved the stabilizer—but we now know that even two are enough for disaster. We also know that MCAS had received significant attention in the press and the aviation community. The crew had also been given information about MCAS, although it is uncertain whether this information was presented in a way that ensured they had thoroughly read and correctly understood the procedures.
Regardless, this cockpit emergency proved to be much more complicated and ultimately more politically contentious than the one faced by Lion Air’s pilots.
For the Ethiopian crash, a new element was a crucial autothrottle problem that was unique to this flight and at the center of the occurrence. Here, the engines were locked into full thrust because of a series of separate component interactions tied to the failed AOA sensor, which combined to create a failure that was never stopped. We have here what organizational sociologist Charles Perrow famously labeled a “normal accident” – “normal” in that disasters arising from bizarre technological interactions are, by nature, inevitable, even if they are rare. This dynamic involves the “tight coupling” of separate components and an “interactive complexity” in their connections, producing failures that cannot be seen coming and that unfold too fast for people to stop.
So what happened?
First, the pilots knew about MCAS and the new procedural instructions for runaway trim now associated with MCAS failure. Unlike Lion's pilots, they did not find MCAS a total mystery. However, the procedural instructions could have been misunderstood or poorly communicated.
Second, Ethiopian Flight 302 experienced two discrete phases of crisis, both tied to an AOA sensor failure. The first involved the autothrottle acting dangerously on bad sensor data, combined with the pilots' failure to disengage it as they were trained to do. The second involved MCAS activating at a very perilous point in the crisis, compounding the problem and quickly narrowing the window for recovery.
Third, the pilots struggled to make sense of what was happening. It began with a dissonance of alarms in the cockpit even before MCAS engaged—similar to Lion. Then, when they retracted the flaps, MCAS triggered. This is not typical of runaway trim. While the pilots vaguely knew about MCAS, they did not know the conditions that activate it or the experience they would face. Therefore, they did not act immediately to CUT OUT the system.
Fourth, the pilots appeared to lose situational awareness, allowing the airplane to continue at full throttle after they had commanded a slower speed – they did not recognize that the airplane had not slowed down.
Lessons.
First, the Boeing MAX story offers a unique opportunity to study the Occurrence of Failure because two disasters and a third flight emergency all arose from the same structure and triggering event – MCAS’s design and a failed AOA sensor.
Second, a failed socio-technical design—its apparatus—can be realized as an accident or disaster in multiple ways—in fact, in quite different ways. For Lion, the triggering event was a replacement AOA sensor; for Ethiopian, it was a bird strike. Lion's emergency centered entirely on many MCAS activations and the pilots' failure to fully counter the trim movements. Ethiopian's emergency was more complicated due to the two-phase sequence of autothrottle problems followed by MCAS activations.
Third, it is only through a disaster or near disaster that design problems might be identified. Lion 610 demonstrated that the assumption about how pilots would react to runaway trim was incorrect. Ethiopian 302 highlighted much more starkly the issue of not controlling for airspeed.
THINGS YOU CAN DO:
Let me know your thoughts.
Please share your views, insights, and opinions. Episode 12 will be dedicated to feedback from listeners such as you.
You can contact me through the MAX 8 Podcast Comments form. While I may not be able to respond to all comments, I will read each one carefully. I’m very interested in your thoughts.
Download my Framework of Failure description.
The Framework of Failure is summarized in a six-page PDF that can be downloaded. Access is at the bottom of the Home Page at BradIvie.com.
Subscribe for updates and announcements.
Please sign up to receive periodic email communications from me, primarily announcements of new podcast episodes and (in the future) blog posts. The signup form is at the bottom of all web pages on BradIvie.com.
Share this episode with friends and colleagues.
This podcast is created for many audiences: business professionals, management consultants, aeronautics industry professionals, aviation enthusiasts, policymakers, and the general public. Please share this episode with those who you feel would be interested in this story and would benefit from the information provided and the analytic approach taken. Or perhaps share the video trailer for the series.