Good Investigators Don’t Search for the Who

Part One of Three

A Four Minute Read

In her excellent book The Crash Detectives, journalist Christine Negroni surveys a century’s worth of aviation disasters and the investigations into their causes. One particularly compelling story is of a 2009 medical evacuation; conditions were so bad that after four missed approaches into Melbourne, the ultimate solution was to put the plane down in the Pacific. It’s a harrowing tale of too few life jackets and a pilot who spent ninety minutes swimming around like a border collie to keep the passengers together before a fishing boat finally found them. Everyone survived.

By almost any measure, Captain Dom James was a hero that night; to one man at the Australian Civil Aviation Safety Authority, he was a goat. CASA director of safety John McCormick told investigators that the evening’s events were “entirely the fault of the captain.” Subsequent investigations – including a particularly scathing television report – harshly exposed that lie, claiming that McCormick “attacked” Captain James, calling his inquiry “manifestly unfair” and accusing it of covering up key facts.

Source: Australian Broadcasting Corporation


This story resonates because it centers on a bias I’ve confronted so many times as a manager and troubleshooter: When something goes wrong, the initial instinct is to focus on who screwed up. Too many root cause analyses have spiraled into a “who do we fire” exercise, where a “successful” resolution is indeed some poor sap’s termination.

To say this is human nature is beyond my expertise, but it is remarkably common. Gene Kranz and his colleagues from the Apollo 13 mission are routinely cited as the gold standard in applying critical thinking during chaos; even so, in Ron Howard’s film, Jim Lovell’s first line of dialogue after the onboard explosion is, “What did you do?” Our tendency to pursue a specific bad actor with a Javert-like mania, as CASA’s McCormick did, is as harmful as it is natural.

Why ‘Harmful’?

A full reckoning of that flight showed a number of factors that contributed to the crash. Captain James had insufficient tools to create and file his flight plan (including reliable information on flying conditions); safety regulations did not require air ambulances to carry enough fuel to divert in case of inclement weather; he topped the tanks on the assumption he would be able to fly in airspace conducive to more conservative fuel consumption, but the aircraft lacked the necessary equipment to enter that corridor. Reminiscent of the Continental Connection 3407 crash in Buffalo, NY, earlier that year, crew scheduling rules led to significant pilot fatigue.

Because of his preconception of the captain’s guilt, McCormick omitted all of those details from the official investigation. In her book, Negroni quotes a 2007 study: “‘The assignment of blame artificially and prematurely restricts the investigation process’ and can even stop the investigation in its tracks.” This is confirmation bias, and it is an all-too-frequent story: Once a team has locked onto a pet cause, the analysis becomes an exercise in collecting incriminating evidence, at the expense of finding alternate theories.

Captain James is not blameless in this story; ultimately, he was responsible for putting that plane in the sky. To him, the consequences of not flying (fighting his bosses, mucking through red tape) seemed more exhausting than the risk he perceived in flying. And therein is the tale that my investigation teams have dealt with over the last two decades, whether we were trying to figure out how Eddie Vedder got lost, why 38 trailer loads of finished product were rejected by the retailer, or whether that overheating on the reactor’s rods was something to worry about.

It’s Not the Who

At their core, all of these inquiries incorrectly focused on finding out who the guilty party was: Who was the last person to see Eddie, who ran the tests on the raw material, who was supposed to call who in the control room. But the heart – the root cause – of almost every legitimate problem does not rest with bad employees, but with systems and leaders.

Most managers scoff at that, and then outright laugh at this: Employees don’t want to screw up. Almost no one gets up in the morning and thinks, “Today’s the day I get to act in total disregard for my coworkers, bosses, customers, and equipment!” It all comes down to Binney’s Second Law: Everything is either a “leadership problem” or a “training problem.”

If employees have never been trained in what to do, or how to do it, it is tough to nail them for getting it wrong. Even then, research shows that lack of training is the cause of fewer than one-sixth of operational problems. The rest come from leadership issues:

  • Do employees truly understand what is expected of them in different situations, and why that’s the expectation?
  • Do they have the auxiliary tools and resources to achieve those expectations?
  • How does their world react when those expectations aren’t met?

The answers to these questions are entirely in leadership’s control. In my next post, I will talk about the importance of setting – and confirming – clear expectations (Spoiler alert: The first step is to actually know what those expectations are!). A second post will address how to investigate the environment – and begin creating a system where success becomes inevitable.

It is fitting that the first 80 percent of The Crash Detectives is consumed with research into aircraft design flaws and conspiracy theories, before looking at pilot error. Negroni interviews John Lauber, a psychologist and former member of the United States’ National Transportation Safety Board: “All human performance takes place in a context,” he says. The systems, technology, support, and training that surround everyone – from airline pilots to bellhops – shape and guide our behavior.

Without understanding that human context, most issues are impossible to fully resolve. Make no mistake – operator error is a real thing. But to close the book and say “operator error” is the root cause is a mistake; savvy troubleshooters seek the cause-behind-the-cause and set the appropriate context.