I may be too trusting, but I generally accept upgrades. Several months ago, I willingly accepted an iPhone operating system upgrade, and lost all the Notes I had stored on my phone. These notes contained bank and credit card details, passport details, and other useful things which I have to consult from time to time, mostly when travelling. The real eye-opener is that I had stored these notes on my phone rather than the cloud, assuming they were more secure and more private because they were restricted to the hardware in my pocket, mine and mine alone. Not so. I was taught a lesson: Apple has the keys to what is in effect my portable office, and can destroy my arrangements at will, or by mere insouciance. They can decide what is best for me.
We are now in the public discovery phase of examining why two new planes have fallen out of the sky, with pilots struggling to stop them diving into the ground. US pilots reported the problem anonymously (as shown above), and the inadequacy of the manual and training was already known. The crashes have happened to foreign airlines, but an unknown risk has been revealed for all passengers to see.
Thank you for the comments on my previous post, particularly those which have found additional material from other aviation sources, and gone into the history of the development of the 737 series. Thanks also for the videos on the general principles of flight. General principles are the foundations of understanding.
I think I was probably looking at aviation websites in November, just after the Lion Air crash on 29 October, and formed the opinion that there was something wrong with the anti-stall system, and told people about it. I might have told anyone willing to listen in November, but I know I discussed this with a test pilot on 22 December 2018. We both recall the discussion, and family members who were present remember the basic points being made. Philip Tetlock ( https://www.unz.com/jthompson/the-tetlock-forecast/ ) will tell you, absolutely correctly, that predictions have to be as specific as possible before they can even be assessed. So, further disclosure: I think I argued the case solely on air-speed indicators, not angle of attack indicators, and did not know or did not include anything about the design change history of the 737 Max series, simply that the Lion Air crash suggested an anti-stall system problem.
This story has it all: the complexities of operator/machine interfaces (mostly a cognitive issue), the intricacies of modern aircraft (mostly a scientific issue with some cognitive aspects) and the compromises involved in the aircraft industry, concerning safety, operating and training costs, and competition between manufacturers (economic and political issues).
My focus is on the cognitive task of flying a plane, and forming an understanding of how systems work and how they must be managed in emergencies. I am also interested in the cognitive aspects of maintaining a plane, fault reporting and correcting. Psychology has a part to play in the discussion of cognitive tasks. For example, what is the natural thing to do when, shortly after take-off, a plane starts diving into the ground? Read a manual? Recall from memory, as the plane lurches ever downwards, what needs to be done? Call to mind the checklist of tasks required to disengage a system which unknown to you has been fooled by an unreliable angle-of-attack indicator? My view is that a cockpit is no place for badly designed IQ test items. Systems have to be adapted to human information processing limitations, and must fit in with startle responses and standard pilot reactions and conventions.
Using James Reason’s explanatory framework (Human Error, 1990), pilots flying the Boeing 737 Max 8 and encountering the opaque workings of MCAS (the Manoeuvring Characteristics Augmentation System) are carrying out intentional but mistaken actions: they are trying to pull a plane out of a dive. The plane is in fact climbing away from an airport after takeoff, but a failure in an angle of attack indicator has convinced MCAS that it is in a stall condition. (For extra money, you can buy a second angle of attack indicator, and apparently these two airlines did not do so. For safety, two should be standard at no extra cost.) Accordingly, MCAS puts the nose of the plane down to avoid the stall. The pilot reacts by pulling back the yoke so as to resume upward flight, cognizant of the plain fact that unless he can gain height he is going to die, together with his passengers. His action satisfies MCAS for a short while, and then it comes in again, helpfully trying to prevent a stall (because pulling on the yoke is not enough: the whole tail plane has to be “trimmed” to the proper angle). Pilots are doing what comes naturally to them.
MCAS is diligently doing as instructed, but it is badly designed: in this case it relies on a single indicator, rather than on two, which could identify and resolve discrepancies, and it has no common sense about the overall circumstances of the plane. The pilots know that they have just taken off. MCAS, as far as I know, does not “know” that. Again, as far as I know, MCAS does not even know what height the plane is at. (I know that this is not real Artificial Intelligence, but I used it as an illustration of some of the problems which may arise from AI in transport uses.) The pilots respond with “strong-but-wrong” actions (which would be perfectly correct in most circumstances) and MCAS persists with “right-but-wrong” actions because of a severely restricted range of inputs and contextual understanding. Chillingly, it augments a sensor error into a fatal failure. A second sensor and much more training could reduce the impact of this problem, but the inherent instability of the engine/wing configuration remains.
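To make the single-sensor fragility concrete, here is a minimal sketch in Python. All names and the disagreement threshold are illustrative assumptions of mine, not Boeing’s actual logic; the point is only that two sensors permit a cross-check that a single sensor cannot provide.

```python
# Hypothetical sketch: why acting on a single angle-of-attack (AoA) vane is
# fragile. With two sensors, a disagreement can be detected and the automated
# nose-down command inhibited; with one, a faulty reading is simply trusted.

DISAGREE_THRESHOLD_DEG = 5.5  # illustrative figure, not a real certification value

def aoa_with_crosscheck(left_deg, right_deg):
    """Return (aoa, valid). Flag the reading invalid if the vanes disagree."""
    if abs(left_deg - right_deg) > DISAGREE_THRESHOLD_DEG:
        return None, False                # discrepancy: inhibit automatic trim
    return (left_deg + right_deg) / 2, True

def aoa_single(sensor_deg):
    """A single-sensor system has no cross-check: any value is trusted."""
    return sensor_deg, True

# Example: the left vane fails high, falsely indicating a stall on climb-out.
print(aoa_with_crosscheck(21.0, 4.0))  # discrepancy detected, automation stands down
print(aoa_single(21.0))                # faulty value accepted, nose-down trim follows
```

The cross-check does not say which sensor is wrong, only that the pair cannot be trusted, which is exactly the moment an automated system should defer to the pilot rather than act.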
Using Reason’s GEMS (Generic Error-Modelling System), the pilots made no level 1 slips or lapses in piloting. They had followed the correct procedures and got the plane off the ground properly (occasionally a pilot forgets to put the flaps down at take-off or the wheels down at landing). I think they made no level 2 rule-based errors, because their rule-based reactions were reasonable: they considered the local state information and tried to follow a reasonable rule: avoid crashing into the ground by trying to gain height. They could be accused of a level 3 error, a knowledge-based mistake, but the relevant knowledge was not made available to them. They may have tried to problem-solve by finding a higher-level analogy (hard to guess at this, but something like “we have unreliable indicators” or “we have triggered something bad in the autopilot function”), but then they must revert to a mental model of the problem, and think about abstract relations between structure and function, inferring a diagnosis, formulating corrective actions and testing them out. What would that knowledge-based approach entail? Either remembering exactly what should be done in this rare circumstance, or finding the correct page in the manuals to deal with it. Very hard to do when the plane keeps wanting to crash down for unknown reasons shortly after take-off. Somewhat easier when it happens at high altitude in level flight.
At this point it needs to be pointed out that there is some confusion about how easy it was to switch off MCAS. All the natural actions with the yoke and other controls turn it off, but not permanently. It comes back like a dog with a stick. Worse, it will run to collect a stick you didn’t throw. The correct answer, from the runaway stabilizer trim checklist, is to flick two small switches down into the cut-out position. Finding them may be a problem (one does not casually switch things off in a cockpit), and for those not warned about the issue, the time available to work out the required arcane procedure may be insufficient at low altitudes, such as after take-off. Unsurprisingly, pilots did not understand the complexity of this system. They had a secret co-pilot on board, and hadn’t been told.
This is the model I generally depict as slices of Swiss cheese with holes in them, which sometimes line up.
Safety depends on understanding risks and providing protection in depth. No protective filter is perfect, so several are placed in sequence in the hope that they will trap all but the very rarest events. What is curious about MCAS is that it was given power to assume command. It was not so much defense in depth as an attempt to overcome an intrinsic defect. It is a slice of cheese placed very early in the defensive array, despite having a massive hole in it, in that it prioritizes one vulnerable indicator, and then operates independently, and against pilot wishes. A chain of engineering and economic decisions led to the 737 Max series, and MCAS might have worked if there had been redundancy in angle of attack and airspeed indicators, a way of integrating the inputs, and most of all, a way of communicating to pilots what the system was trying to do, and for what reason. The chain of command should have been that the pilot made decisions, and MCAS made suggestions and asked for permission.
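The “suggestions, not commands” chain of command argued for above can be sketched in a few lines. This is my own illustrative toy, not any real avionics interface: the automation may only propose a correction, and nothing happens unless the pilot explicitly approves.

```python
# Hypothetical sketch of an "advise and ask permission" chain of command:
# the automation suggests, the pilot decides. All names are illustrative.

def automation_advice(stall_suspected):
    """The automation may suggest a correction, but cannot apply one itself."""
    return "suggest nose-down trim" if stall_suspected else None

def resolve(advice, pilot_approves):
    """Only a pilot-approved suggestion becomes an action."""
    if advice is None:
        return "no action"
    return "apply nose-down trim" if pilot_approves else "hold: pilot retains command"

# A false stall warning is then harmless: the pilot simply declines.
print(resolve(automation_advice(True), pilot_approves=False))
```

Under this arrangement a faulty sensor produces at worst a spurious suggestion, not a repeated uncommanded dive.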
Some of the comments were about the race of the pilots. I do not avoid race differences in intelligence as explanations for human behaviour, but I don’t see that as the most likely explanation in this case. If a system is opaque and not properly communicated to pilots then it is a liability for all pilots. Properly selected pilots should be much alike in this regard. However, I know that airlines differ in safety records, in maintenance standards, and in the reporting and correcting of faults. In my view, Lion Air did not deal with a faulty indicator properly. A friend in UK air traffic control tells me that the best airlines deal with issues quickly, while other airlines (often low-cost ones) are more likely to log them but fix them later. They tolerate errors and, mostly, nothing much happens. In sum, it would be good to look at the safety profiles of the airlines. Ethiopian Airlines is considered reasonably safe, and Lion Air less so, but assessing those issues means looking carefully at pilot capabilities and the general quality of maintenance. (A surrogate may be to look at deaths per road mile traveled in the countries concerned. All this is for another day.)
Although the current focus is quite rightly on Boeing, the same general principles apply to Airbus and to all transport systems. For example, cars have safety systems, but do not yet take over control without permission (or at least not until a collision is imminent). Cars could advise, given road and weather conditions, what speed was prudent, or what the current speed meant in terms of stopping distances. A well-explained warning system can be helpful, like those that assess the alertness of the driver and display a warning. Good. Such systems do not take over the steering wheel or put on the brakes.
Flying is safe, which paradoxically makes these crashes even harder to bear. Boeing has an MCAS upgrade in the pipeline. Despite the compromises in the design of the Max series, which affect the centre of gravity, the improved system might work, probably by using several sensors provided as standard.
On the other hand, passengers might not be convinced. They may feel they were taken for fools by an arrogant company, and decide to shun the 737 Max, if only to give manufacturers a very clear message: build safety into the design, and keep the controls simple, stupid.