Though a "creative" by trade, I pride myself on being highly analytical. I love reading about the psychology of decision making and learning to identify logical fallacies in order to make the best possible judgments.
The book Thinking, Fast and Slow by Daniel Kahneman is a fascinating read that has changed how I view the world and make decisions. In his book, Kahneman explores the biases of our intuition, and teaches the reader how to identify and overrule intuitive hunches in favor of statistically valid reasoning.
A prominent motif of Kahneman's book is the notion that humans are poor intuitive statisticians. Rather than relying on probability—base rates, sample sizes, likelihood—humans tend to make predictions in terms of plausibility: what scenario results in a coherent story? Throughout the book, Kahneman lays out numerous heuristics, biases, and principles that demonstrate this phenomenon. I'd like to share 3 such concepts that have really stuck with me.
1. Base rate neglect
Consider the following example:
Jon is a quiet, bookish type. He is somewhat shy, but enjoys lively intellectual debates about philosophy and economics. Those who know him well describe Jon as genuine, warm, and helpful.
Do you think Jon is more likely to be a librarian, or an insurance salesman?
I just made this example up, but most people would determine that Jon is far more likely to be a librarian. This makes coherent sense: "bookish" and "quiet" fit the profile of the prototypical librarian, while "genuine" and "shy" run in contrast to how we usually think of insurance salesmen.
However, there are about 4 times as many insurance salesmen as librarians in the US. When you bring Jon's gender into the equation, that disparity (male librarians vs male insurance salesmen) drastically increases. Even though Jon fits the profile of a librarian, the base rate indicates that he is far more likely to be an insurance salesman. So how should we make a judgment given this conundrum?
It is tempting to coast along and accept the more coherent, prototypical story as the most likely truth (e.g., Jon must be a librarian). However, we can venture a more accurate guess by anchoring our judgment using base rates (instances of male insurance salesmen vs librarians), and adjusting that judgment using our intuition given the evidence of a particular case (Jon's personality). This method guards against the temptation to let a coherent story overpower our decision making.
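The anchor-and-adjust idea above can be written down as a small calculation using the odds form of Bayes' rule. This is only a rough sketch: the 1-to-4 base rate comes from the figure quoted above, but the function name posterior_librarian and the "3 times better fit" number are assumptions I made up for illustration, not figures from Kahneman's book.

```python
def posterior_librarian(prior_odds_librarian, likelihood_ratio):
    """Probability that Jon is a librarian, using the odds form of Bayes' rule.

    prior_odds_librarian: librarians per insurance salesman (the base rate).
    likelihood_ratio: how many times better Jon's description fits a
        librarian than an insurance salesman (an assumed, made-up number).
    """
    posterior_odds = prior_odds_librarian * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


# Base rate from the text: roughly 4 insurance salesmen per librarian.
prior_odds = 1 / 4

# Assumption for illustration: the description fits a librarian 3x better.
p = posterior_librarian(prior_odds, 3.0)
print(round(p, 3))  # 0.429
```

Under these assumed numbers, Jon is still more likely to be an insurance salesman than a librarian. The description would have to fit a librarian more than 4 times better before the personality evidence outweighs the base rate.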
2. The law of small numbers
Incidence of renal cancer is lowest in rural, mostly Republican, sparsely populated counties in the US Midwest and South. Hearing this fact leads us to believe that something about the average lifestyle and environment in these counties is conducive to healthy kidneys.
But there's a twist: the highest incidence of renal cancer was also recorded in rural, mostly Republican, sparsely populated counties in the Midwest and South. Surely the lifestyle and environment can't cause both high and low incidences of renal cancer.
Republican politics, geography, and rural landscape have nothing (or at least, immeasurably little) to do with rates of renal cancer. The only important attribute of these counties is that they are sparsely populated. This follows from the law of small numbers: small samples are more likely than large samples to yield extreme results.
Suppose there is an urn of marbles in which 90% of the marbles are white and 10% are red. Timmy and Joey take turns drawing marbles; each time Timmy draws 8 and Joey draws 4. Over many trials, Joey will draw a far higher percentage of "extreme" hands. Drawing all red marbles is unlikely for either boy, but it is 10,000 times rarer for Timmy: 0.1^8 compared to 0.1^4 (treating each draw as independent).
This example translates perfectly to the renal cancer problem: small populations are more likely to "draw" more extreme hands. It is a mathematical fact, but it is easily obscured by our intuitive urge to jump to more coherent-sounding causal conclusions. The statistical argument is clear as day when laid out explicitly, but is often completely overlooked when a compelling, causal explanation presents itself.
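To see the effect for yourself, here is a minimal sketch that computes the marble odds and then simulates "counties" of two different sizes that all share the same underlying disease rate. The population sizes, the 1% rate, and the function names are assumptions chosen to keep the simulation quick; they are not real renal cancer figures.

```python
import random


def p_all_red(n_draws, p_red=0.1):
    """Chance that every marble drawn is red, treating draws as independent."""
    return p_red ** n_draws


def simulate_county_rates(population, true_rate, n_counties, rng):
    """Observed incidence in each county when every resident has the same risk."""
    rates = []
    for _ in range(n_counties):
        cases = sum(rng.random() < true_rate for _ in range(population))
        rates.append(cases / population)
    return rates


# Marbles: an all-red hand of 4 (Joey) versus an all-red hand of 8 (Timmy).
ratio = p_all_red(4) / p_all_red(8)
print(f"Joey's all-red hand is about {ratio:,.0f} times more likely than Timmy's")

# Counties: identical true rate, very different population sizes (assumed values).
rng = random.Random(42)
small = simulate_county_rates(population=500, true_rate=0.01, n_counties=100, rng=rng)
large = simulate_county_rates(population=20_000, true_rate=0.01, n_counties=100, rng=rng)

spread_small = max(small) - min(small)
spread_large = max(large) - min(large)
print(f"small counties: lowest {min(small):.4f}, highest {max(small):.4f}")
print(f"large counties: lowest {min(large):.4f}, highest {max(large):.4f}")
```

Every simulated county has exactly the same risk, yet both the lowest and the highest observed rates come from the small counties. Nothing about lifestyle or environment is needed to produce that pattern.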
3. Regression to the mean
For most of my life, I've been a competitive ski racer. If you've ever watched ski racing, you might know that it consists of two timed runs. The fastest combined time of the two runs wins the race.
For the sake of this example, let's say that a racer's skill level has plateaued. Her performance in any run is the sum of her (constant) skill and her (variable) luck. If the racer has a remarkably good first run (that is, if she places higher than she normally does, or beats racers that normally beat her), she is most likely going to have a worse performance during the second run. Nothing is compensating for her earlier good fortune; her luck on the second run is simply unlikely to be as exceptional as it was on the first, so her result drifts back toward the average set by her constant skill level.
The racer's luck in a given run can be imagined in terms of a bell curve: most of the time the racer will have average luck, some of the time she will have moderately good or moderately bad luck, and on rare occasions she will have extremely good or extremely bad luck. These instances are a function of nothing but chance. Over many runs, instances of good luck are about as common as instances of bad luck, so they average out. Although other variables may play a role, luck accounts for the lion's share of this run-to-run variability, and regression to the mean is what we observe when an extreme run is followed by a more typical one.
Regression to the mean is a critical and often confusing concept in making sound judgments, and failure to acknowledge it can be costly. The sports world offers excellent examples of regression to the mean. If a basketball coach insists on changing the team's tactics to emphasize passing to one player because he has a "hot hand", that decision could cost the team the game when that player's performance regresses to the mean. Or if a football team drafts a rookie due to his outstanding performance in the previous season's playoffs, they are likely to be disappointed when he regresses to the mediocre performance he had displayed during the regular season. Although it does not paint a glamorous picture, it is important to consider regression to the mean when evaluating series of events where skill is constant and luck is variable.
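The skill-plus-luck model is easy to simulate. The sketch below is a toy under stated assumptions: a run time is a constant 60 seconds of "skill" plus bell-curve luck with a standard deviation of 1 second, and the second run's luck is independent of the first. Those numbers and the function name simulate_races are invented for illustration.

```python
import random
import statistics


def simulate_races(n_races, skill, luck_sd, rng):
    """Return (first_run, second_run) times; each is constant skill plus random luck."""
    races = []
    for _ in range(n_races):
        first = skill + rng.gauss(0, luck_sd)
        second = skill + rng.gauss(0, luck_sd)  # independent of the first run
        races.append((first, second))
    return races


rng = random.Random(0)
# Times in seconds, so lower is better. All values here are assumed.
races = simulate_races(10_000, skill=60.0, luck_sd=1.0, rng=rng)

# Keep only the races where the first run was among her fastest 10%.
cutoff = sorted(first for first, _ in races)[len(races) // 10]
lucky = [(first, second) for first, second in races if first <= cutoff]

mean_first = statistics.mean(first for first, _ in lucky)
mean_second = statistics.mean(second for _, second in lucky)
print(f"average first run after a great start:  {mean_first:.2f} s")
print(f"average second run in those same races: {mean_second:.2f} s")
```

In this toy model the second runs that follow her best first runs come out close to her usual 60 seconds. She did not get worse and she is not being punished for earlier luck; the second draw of luck is just an ordinary one.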
Our minds are exceptionally adept at inferring causal relationships, because a causal story satisfies our urge to understand a problem coherently and rationally. But as these few examples demonstrate, our craving for coherence often works against us: thinking statistically, rather than intuitively, often leads to more accurate judgments. When making judgments and decisions, especially when the stakes are high, it is important to think like a statistician, or risk falling into these 3 traps, and many more.