Hello, and welcome to a new edition of Expander. I’m Abhishek, and each week I share practical ideas on doing product, measuring what matters, working with people, and growing a business. Send me your questions and in return, I’ll humbly offer BS-free actionable advice. 🤜🤛
To receive this newsletter in your inbox weekly, consider subscribing. 👇
Today, let’s talk about NASA. Specifically, the Space Shuttle Challenger disaster, and a few lessons we can learn about how to avoid organisational failures. In so many ways, this post is a spiritual successor (and somewhat a rebuttal) to my previous post on trusting the process. Let’s begin!
The fatal accident occurred on January 28, 1986, when the Space Shuttle Challenger broke apart 73 seconds into its flight, killing all seven crew members aboard. This was caused by the failure of O-ring seals that were not designed to handle the unusually cold conditions during the launch. But, like so many other disasters, the cause of failure isn’t that simple.
But before we proceed, we’ll have to understand a wee bit of rocket science. You see, O-rings are rubber rings that sit squashed in the joints that connect the various sections of the rocket booster. During ignition, burning gas comes shooting down the rocket booster. Meanwhile, the metal walls (that connect to form a joint) pull apart for a split second, at which point the rubber O-rings expand to fill the space immediately and keep the joint sealed. If this goes as described, lift-off is achieved and life goes on along with the rocket.
But… if the O-rings don’t expand instantly to fully seal the joint during ignition, burning gas can shoot right through the rocket booster walls. This behaviour is called “blow-by” and it’s kind of a life-or-death condition. Turns out, cold temperatures can dramatically increase the chances of blow-by because they harden the O-ring rubber.
Two pre-Challenger flights had blow-by but they luckily returned home safely. But there were strong signs that cold temperatures aren’t good for launches… and this is where things start to get interesting. The engineers knew about it, but NASA didn’t. This sounds absurd, and to understand why this is absurd, we have to learn a bit about NASA’s culture (in the 80s).
Thanks to an extraordinarily strong technical culture, NASA had developed quantitatively rigorous flight readiness reviews. Managers grilled engineers and forced them to produce data to back up their claims whenever they made any assertions. A framed quote hung in the Mission Evaluation Room: “In God We Trust, All Others Bring Data.”
This ethos and a strong quantitative process had worked remarkably well so far. The space shuttle was the most complex machine ever built by human beings, and all twenty-four flights (before Challenger) had returned home safely. There was no reason to question this rigorous data-driven culture.
But in the emergency conference the night before the launch, when the weather reports predicted unusually cold Florida weather, that same quantitative culture led the team astray.
Remember the two flights that returned safely despite blow-by? Well, an engineer had personally inspected their joints (well before the Challenger launch). The one that launched at 75 degrees Fahrenheit had a thin streak of light-grey soot, but it was nowhere close to a catastrophic problem. But the one that launched at 53 degrees Fahrenheit had jet-black soot across a large swath of the joint. The cool conditions had hardened the O-rings and made them slow to expand and seal at ignition. This was a big, bright red flag, one that NASA completely ignored.
The engineer’s analysis was right, but he did not have the data to prove it. “I was asked to quantify my concerns, and I said I couldn’t,” he later testified. “I had no data to quantify it, but I did say I knew that it was far away from goodness.”
The very process that had made NASA so consistently successful — the original technical culture in the agency’s DNA — suddenly started to work against them when they blindly trusted the process.
The engineer’s concerns were based on a few photographs. Even though the difference was telling, it was a “qualitative assessment”. This was considered an emotional argument in NASA’s culture. It did not conform to the usual quantitative standards, so it was deemed inadmissible evidence and disregarded.
You see, reason without numbers was not accepted, no matter what. The credo, “In God We Trust, All Others Bring Data”, loosely translated to, “We’re not interested in your opinion on things. If you have data, we’ll listen, but your opinion is not requested here.”
Richard Feynman was one of the members of the commission that investigated the Challenger disaster, and in one hearing he admonished a NASA manager for repeating that the data from the photographs weren’t conclusive. “When you don’t have any data,” Feynman said, “you have to use reason.”
After the tragedy, many engineers said they agreed with the qualitative assessment from the photographs, but knew they could not muster quantitative arguments, so they remained silent. And their silence was taken as consent. Because if you feel like you don’t have data to back things up, the boss’s opinion is better than yours. After all, you went “by the book” and cannot be blamed.
An effective company culture is usually thought to be consistent — one that promotes self-reinforcing consistency in the same direction — and people like consistency. While hiring, we often rely on “cultural fit” to maintain this consistency. The technical term for cultural fit is “congruence”.
But the most effective companies don’t benefit from congruence. They thrive from “confluence” — when two or more forces come together from opposite directions. The most effective company cultures, and by extension their leaders, are paradoxical. They are demanding and nurturing, orderly and entrepreneurial, hierarchical and individualistic, all at once.
When cultures become too internally congruent, it leads to a “yes man” attitude where following the standard procedure becomes more important than following the correct procedure. We need confluence to build cross-checks and challenge blind congruence.
NASA’s “can do” culture manifested as extreme process accountability combined with collectivist social norms. Everything was so geared toward conformity with standard procedures that the congruence itself became detrimental.
An effective problem-solving culture is one that balances standard practice — whatever it happens to be — with forces that push in the opposite direction. As David Epstein writes in Range, “The trick is to expand the company’s range by identifying the dominant culture and diversifying it by pushing in the opposite direction.”
One great example of (introducing a bit of anarchy and) promoting confluence of ideas is what Ed Catmull calls “Notes Day” at Pixar. It started with one simple question: how do we tap the brain power of our people?
When the team announced their desire to hear from the organisation about where there were problems (cultural, procedural, or something else), they were hardly prepared for the deluge that followed. 4,000 emails came in, highlighting over 1,000 unique problems, challenges, and opportunities.
The team narrowed these down to 293 potential discussion topics, selected by a single criterion: can we imagine 20 people talking about this for an hour? Managers then whittled the list further, down to 120 topics.
Ultimately, over the course of a single day, 1,059 employees discussed the topics in 171 sessions managed by 138 facilitators in 66 meeting spaces across Pixar’s three buildings.
“Notes Day was a success in part because it was based on the idea that fixing things is an ongoing, incremental process. Creative people must accept that challenges never cease, failure can’t be avoided, and ‘vision’ is often an illusion. But they must also feel safe—always—to speak their minds. Notes Day was a reminder that collaboration, determination, and candour never fail to lift us up,” writes Catmull in Creativity, Inc.
Netflix introduces confluence in the engineering process via a tool called Chaos Monkey. Chaos Monkey is a script that runs continually in all Netflix environments, causing chaos by randomly shutting down server instances. Thus, while writing code, Netflix developers are constantly operating in an environment of unreliable services and unexpected outages.
This chaos not only gives developers a unique opportunity to test their software in unexpected failure conditions, but also incentivises them to build fault-tolerant systems, which makes their day-to-day job less frustrating and the Netflix servers more robust.
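To make the idea concrete, here is a toy sketch in Python. It is not Netflix’s actual Chaos Monkey, just a simulation of the principle: something randomly shuts down instances, so the calling code has to be written to survive that. Every name in it (`Instance`, `ChaosAgent`, `call_with_failover`, `run_demo`) is hypothetical and made up for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical stand-in for one server instance."""
    name: str
    alive: bool = True

    def handle(self, request: str) -> str:
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request}"


class ChaosAgent:
    """Randomly shuts down live instances (a toy take on the idea)."""

    def __init__(self, instances, kill_probability, rng=None):
        self.instances = instances
        self.kill_probability = kill_probability
        self.rng = rng or random.Random()

    def unleash(self):
        """Maybe terminate one live instance; return its name, or None."""
        live = [i for i in self.instances if i.alive]
        # Assumption for this demo: always leave one instance running.
        if len(live) > 1 and self.rng.random() < self.kill_probability:
            victim = self.rng.choice(live)
            victim.alive = False
            return victim.name
        return None


def call_with_failover(instances, request):
    """Fault-tolerant caller: try each instance, skipping dead ones."""
    for instance in instances:
        try:
            return instance.handle(request)
        except ConnectionError:
            continue
    raise RuntimeError("no instance could serve the request")


def run_demo(seed=0, rounds=5):
    """Serve a few requests while the agent keeps breaking things."""
    pool = [Instance(f"server-{n}") for n in range(3)]
    agent = ChaosAgent(pool, kill_probability=0.5, rng=random.Random(seed))
    results = []
    for r in range(rounds):
        agent.unleash()
        results.append(call_with_failover(pool, f"request-{r}"))
    return pool, results


if __name__ == "__main__":
    pool, results = run_demo()
    for line in results:
        print(line)
```

The interesting part is `call_with_failover`. A naive caller that always talks to `server-0` would start failing the moment the agent picks it as a victim, and that is precisely the kind of hidden fragility this practice is meant to surface early.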
Before Challenger, there was a long span when NASA’s culture harnessed confluence. When Apollo 11 landed on the Moon, NASA lived by that same mantra and valorised process, “In God We Trust, All Others Bring Data”, but it also had a culture of seeking the opinions of technicians and engineers at every level of the hierarchy. If the same hunch was heard twice, it didn’t take data to interrupt the usual process and investigate.
NASA found a way to balance its rigid process with an informal, individualistic culture that encouraged constant dissent and cross-boundary communication. Akin to Pixar’s Notes Day, NASA had “Monday Notes” where every week engineers submitted a single page of notes on their salient issues. Managers handwrote comments in the margins, and then circulated the entire compilation. Everyone saw what other divisions were up to, and how easily problems could be raised. Monday Notes were rigorous, but informal. But alas… this beautiful process got lost when management changed.
In the later days, Monday Notes was transformed into a system purely for upward communication. No feedback was given, and the notes did not circulate. In fact, managers grew angry when they learned of problems. At one point it morphed into standardised forms that had to be filled out. As soon as Monday Notes became just another rigid formality in a “process culture”, the quality of the notes fell.
Companies need a creative culture, not a process culture. A creative culture is built on confluence, not congruence. A creative culture welcomes healthy conflict and controlled chaos. It benefits from serendipity. It’s a culture that is both top-down and bottom-up. It’s a culture that is dynamic, and isn’t afraid of challenging itself. It’s a culture that is neither too cocksure nor too doubtful of itself. A culture that taps into the power of its people and constantly evolves, and gets better with time.
Talk to Me
Do you agree with what I said, or do you think otherwise? Send me counters, comments, questions, and other ways to run a business. 🙌
Until next week,
P.S. I also write The Sunday Wisdom, a weekly newsletter that challenges the norms and learned beliefs about how the world works. Delivered every Sunday at 6PM IST.