Penguin/Allen Lane 2011
I first met Daniel Kahneman in May 2001 when I was researching my still-unpublished book about probability. I had expected to visit him at his office in Princeton’s economics department, but for some reason I can no longer remember, Kahneman said the interview should take place at his suburban house.
I was there to ask him about the work he was already famous for—his research with Amos Tversky on heuristics and biases. Patiently, Kahneman answered my questions. But he also wanted to tell me about what were then his current interests.
“What we get, which is really very beautiful—this is what I’m writing about—what is becoming increasingly influential in psychology, is what is called a two system view. You have one system that is intuitive and very quick, and is perception-like, and you have another system that does formal reasoning and logical operations”.
I didn’t pay much attention to those words at the time. A decade later, the zeitgeist has caught up with Kahneman, helping to make his book a best seller. And it deserves to be. Ever since the late Peter Bernstein published Against the Gods, Kahneman has been anthologized and popularized by many writers. Far better in my opinion is to read an accessible book on Kahneman’s work by the man himself.
It’s a testament to the importance of Kahneman and Tversky’s work that much of their thinking is now part of economic orthodoxy. Policymakers around the world are implementing their discoveries about biases and decision-making in areas such as social security and consumer protection.
Kahneman would say that the dominant position of behavioural economics today is the result of hundreds of scientifically sound experiments and observations that have proved him right and his critics wrong. For more than two decades, Kahneman (and Tversky before his premature death) ran the gauntlet of fierce, sometimes vicious attacks in academic journals and conference debates. Like the Israeli army that launched his career as a psychologist, Kahneman struck back hard at his enemies. He invented a concept called “adversarial collaboration” to make peace with some of them. In his book he looks back on some of these battles from a victor’s perspective.
One wouldn’t want (or dare to try) to take away from Kahneman such a hard-fought victory. However, given his work’s increasingly central position among economists and psychologists today, and its embrace by policymakers, it’s important to understand what the limits are.
Then there’s the commercial importance of data-driven analysis of human behaviour, made possible by computing power, smartphones and the internet. Here, Kahneman’s text is quoted by technology writers as providing a business case for “big data” ventures (see, for example, So, What’s Your Algorithm? by Dennis Berman in the Wall Street Journal, Jan. 4, 2012). I’m not sure that Kahneman himself is comfortable wearing such clothes, but with so much money and social transformation at stake, a critical eye is also needed.
Let’s start with the book itself. In writing it so many years after the area of heuristics and biases became a hot topic outside the ivory tower, Kahneman acknowledges the challenge of competing with his own followers and popularisers. He’s also been preceded into print by researchers whom he would consider his peers in the field, notably Richard Thaler and Cass Sunstein with their widely-read book Nudge.
Kahneman organises his book into sections based around three successive “two-system” themes. Firstly we have the fast vs. slow brain dichotomy of the book’s title, or System 1 vs. System 2 as it is described in the text. Then comes the “humans vs. econs” duality that Thaler and Sunstein already popularised in Nudge. Finally, Kahneman introduces the fascinating idea of the “experiencing” vs. “remembering” self.
Humans, smart and dumb
The fast/slow brain dichotomy, which Kahneman mentioned to me in 2001, may have roots in the hostile climate in which his work with Tversky was originally received. According to their critics, the exposing of human biases and poor decision-making amounted to misanthropy, waylaying experimental subjects with trick questions in order to “prove” that people were “dumb”.
For Kahneman, his fellow psychologists not only disliked the message, but also begrudged his and Tversky’s success at identifying so many cognitive illusions. “I think there was a bit of envy because flaws are more interesting to talk about”, he told me.
Leading the attacks against Kahneman was the German psychologist Gerd Gigerenzer, whom he acknowledges in the book’s footnotes as “our most persistent critic”. Rather than build a career exposing flaws in human reasoning, Gigerenzer’s research focused on extolling human decision-making virtues, with titles such as “Simple Heuristics That Make Us Smart”.
While researching my probability book in 2000-1, I was told about the feud between Kahneman and Gigerenzer by academic psychologists. They spoke of it in awed terms as a kind of gladiatorial death-match that could only result in absolute victory on one side and abject defeat on the other.
When I asked Kahneman about the feud, it was clear that he was troubled by it. “It was embarrassing, the level of hostilities”, he said. It galled Kahneman that Gigerenzer outmatched him in rhetorical skills, gaining traction for his false arguments. “I don’t know if you’ve ever heard Gigerenzer, he speaks very well”, Kahneman complained. “Even when he’s completely wrong, it’s hard not to be impressed. I’ve debated him a couple of times”.
While Kahneman felt he had intellectually trounced Gigerenzer at the time of my interview, he made it clear that he wanted to draw the poison out of the conflict and move on. At the point of victory, one might even say he was politically astute in wanting to cut short the death-match, rather than fall into the misanthropic trap of dancing on the corpses of “nice guys”. The fast/slow brain dichotomy provided that route, as is clear to me from reading Kahneman’s book.
That’s why the “Two Systems” section of his book is placed non-chronologically before the recounting of Kahneman and Tversky’s most famous discoveries such as the law of small numbers, anchors, framing, availability, representativeness, conjunction and validity. Rather than saying that humans are outright schmucks, at least when compared with economic rational actors or artificial intelligence systems, Kahneman pins the cognitive flaws on quick-thinking System 1.
Effortlessly substituting an easier question for the one it’s supposed to answer, System 1 needs the assistance of rational, deliberative but slow System 2, which tends to be too lazy to help out when needed. It’s a nice theory, and Kahneman marshals evidence to support it, although one might argue that drawing the line between Systems 1 and 2 is too context-dependent to have predictive value.
It’s certainly true however that the repackaging of his old work gives Kahneman a much more positive message: like a self-help book, he sugars the famous biases with a message of hope for readers to evade the System 1 traps and get their System 2 working. In one sense Kahneman has moved closer to Gigerenzer, who in his native Germany has succeeded in having probabilistic reasoning taught in the national curriculum to children from kindergarten age.
And he’s careful to defuse the old Gigerenzer hand grenade by stressing that System 1 isn’t always stupid. Consider Kahneman’s account of his “adversarial collaboration” with psychologist and consultant Gary Klein. Readers of Malcolm Gladwell’s Blink might remember Klein as a key source in that book, whose work with firefighters and pilots highlights the power of rapid-fire intuitive judgement. In other words, Klein is a fan of System 1.
With Kahneman renowned for debunking intuition as a snakepit of biases, you can imagine him and Klein becoming implacable enemies in the way that only academics can. However, after Kahneman suggests they work together, the two psychologists diplomatically agree to draw a line between where intuition works and where it doesn’t. Kahneman concedes that System 1 works well for the short-term decisions of firefighters and surgeons; Klein agrees that it doesn’t work for long-term psychological assessments or expert judgements.
Another strategic alliance forged by Kahneman is the one with Nassim Taleb. Compared with the adversarial collaborations, this feels more like a quid pro quo: Kahneman extols Taleb’s insights in the text, and Taleb provides a glowing endorsement on the dust jacket. Of course there’s more to it than that: Kahneman welcomes Taleb into the fold of behavioural psychologists in a discussion of what he calls the illusion of understanding, where he pays tribute to Taleb for discovering the “narrative fallacy”, which first made an appearance in Taleb’s 2001 book Fooled By Randomness.
That book was published at a time when fund managers and finance pundits were more esteemed than they are today. Taleb deserves credit for helping to puncture the myth of the star fund manager, showing how stock-picking “skill” is often equivalent to throwing ten heads in a row.
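The coin-tossing comparison can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes that each manager’s yearly result is an independent fair coin flip, and the pool of 1,000 managers is a hypothetical figure chosen for the example, not a number taken from Taleb or Kahneman.

```python
# Assumption: beating the market in a given year is a fair coin flip,
# independent from year to year and from manager to manager.
def prob_streak(years: int) -> float:
    """Chance that one manager 'wins' every year by luck alone."""
    return 0.5 ** years


def prob_at_least_one_streak(managers: int, years: int) -> float:
    """Chance that at least one of the managers has a perfect streak."""
    return 1.0 - (1.0 - prob_streak(years)) ** managers


single = prob_streak(10)                     # 1 in 1,024
crowd = prob_at_least_one_streak(1000, 10)   # hypothetical pool of 1,000
print(f"one manager: {single:.4%}, at least one of 1,000: {crowd:.1%}")
```

Under these assumptions a ten-year streak is very unlikely for any one manager, yet more likely than not to turn up somewhere in a large enough crowd.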
This is all well and good. However, today’s financial landscape is different. Investors are increasingly aware of the “rent” extracted by over-rated fund managers, which explains the popularity of lower-cost products such as exchange-traded funds. These give rise to their own dangers of course, but that is outside the scope of Kahneman’s analysis.
In the context of the financial crisis, Taleb’s ideas have worn less well. I have written elsewhere about the vacuity of Taleb’s signature concept, “the black swan”, and I don’t think it does Kahneman credit to share Taleb’s derision for journalists who supposedly construct fictitious but convincing narratives around what are in reality random events. It can make one sound like a fellow traveller in an attack on free speech, for a start. And in the wake of the crisis, one might argue that randomness is as much of a fallacy as over-convincing narratives. Although events may seem random, that is often a function of the fact that explanations are hard (sometimes dangerous) to uncover.
Where do probabilities come from?
Probability lies at the core of Kahneman’s research over the years. Some of his most important papers amount to discoveries on how humans fail to perceive or use probabilities correctly, compared to the benchmarks of statistical inference. Knowing that probabilistic reasoning doesn’t come naturally to most people, Kahneman leaves the question of what probability actually is until 150 pages into the book.
Passing quickly over the philosophical or statistical definitions of the word, Kahneman makes the point that probability has an everyday meaning that doesn’t trouble his experimental subjects. They jump straight in when Kahneman asks them to estimate probabilities, and their flawed approach, he says, is the result of a “mental shotgun” that intuitively estimates probability using representativeness or stereotypes. It’s that System 1 again.
In his post-Gigerenzer synthesis, Kahneman takes care not to belittle representativeness. People use it for good reasons, he says. Mostly, people who act friendly are friendly. Tall, thin athletes are more likely to be basketball players than footballers and so on. It’s just those tricky situations where the weight of statistical evidence points the other way from the handy stereotype you want to use. Kahneman’s example is “Tom W”, the nerdy, order-loving student who you want to pigeonhole as a computer science student when most students study humanities (at least that was the case in the 1970s when the research was done; today computer science is more popular).
What does “most” mean? It’s what Kahneman calls a base rate, or what some people might call an unconditional probability—the ratio of the historical frequency of the thing you’re looking for versus the entire population. That dry, frequency-based probability has to be converted into something personal: your degree of belief that something is true.
Kahneman suggests using base rates (assuming you can get hold of them) to anchor guesses about probability which you then adjust using the evidence at hand. In essence, this amounts to a poor man’s version of Bayes’ theorem, which is the official statistical route to updating prior knowledge, such as base rates, in the light of new evidence. To help people prone to ignoring dry statistical information in favour of more vivid personal impressions, Kahneman proposes “stereotyping” base rates by embodying their properties in individuals.
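For readers who want to see the “official” route in action, here is a minimal sketch of a Bayesian update applied to a Tom W-style question. Every number in it is a hypothetical value picked for illustration, not a figure from Kahneman’s research, and the function and variable names are my own.

```python
def posterior(base_rate: float, p_desc_if_true: float, p_desc_if_false: float) -> float:
    """Bayes' theorem for a yes/no hypothesis given one piece of evidence."""
    hit = base_rate * p_desc_if_true
    false_alarm = (1.0 - base_rate) * p_desc_if_false
    return hit / (hit + false_alarm)


# All three numbers are hypothetical, chosen only for illustration.
base_rate_cs = 0.03     # share of students in computer science
p_desc_if_cs = 0.80     # chance a CS student fits the "nerdy" description
p_desc_if_other = 0.10  # chance a non-CS student fits it

p = posterior(base_rate_cs, p_desc_if_cs, p_desc_if_other)
print(f"P(computer science | description) = {p:.2f}")
```

With these made-up inputs the description fits a computer scientist eight times better than anyone else, yet the low base rate keeps the updated probability at roughly one in five.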
Increasingly, artificial intelligence systems operate according to Bayesian precepts, crunching “big data” together with new evidence to predict what people will do. Bayesian estimation is starting to become democratised—you can already purchase smartphone apps, aimed at medical professionals, that do the math.
However, that doesn’t mean Kahneman’s interpretation of probability should be accepted uncritically. In most of his and Tversky’s experiments involving probability, the psychologist plays God: he stipulates from outside what the “real” probability is. Subjects then are measured against it.
In reality, there isn’t a correct probability out there, an objective truth that all experts should agree on. Aside from a few narrowly defined problems such as gambling odds, probability is itself uncertain, the outcome of a choice of model or judgement. How can ordinary people’s decision-making be judged against a benchmark that might not even have a value?
This is more than an abstract philosophical debate. Government agencies and the artificial intelligence systems they deploy utilize probability in an even more god-like way than Kahneman, making decisions about individuals using base rate analysis.
Such activity might or might not be reasonable; however, the idea that a base rate probability has a uniquely privileged legal status in questions of liberty must be challenged. Consider the interpretation of probability devised by Frank Ramsey. This precocious English philosopher, who was a friend of Wittgenstein and Keynes, wrote a paper on probability in 1926. Ramsey showed that probability could be derived purely from people’s feelings about uncertain bets they were prepared to make. Base rates may or may not validate these “partial beliefs” (and vice versa) but they do not outrank them.
It’s a quirk of history that Ramsey’s version of probability never reached the mainstream: three years after he wrote the paper in 1926, he died, still in his twenties. Over the next three decades, economists and psychologists coalesced around the expected utility framework developed by John von Neumann, Oskar Morgenstern and Leonard “Jimmie” Savage, which Kahneman and Tversky went on to question with their so-called Prospect Theory.
In a world dominated by expected utility, prospect theory reads like a breakthrough, but in the context of Ramsey’s conception of probability, it feels less original. And Ramsey’s approach based on personal preferences seems relevant in our democracies where state- or corporate-mediated definitions of probability play an ever-increasing role.
From logic to reality
One can critique Kahneman’s ideas on a more fundamental level by questioning one of his key logical assumptions. This assumption is invoked throughout Thinking, Fast and Slow, although Kahneman doesn’t acknowledge it explicitly. Two places in which it comes to the fore, however, are the famous “Linda Paradox” and the “experiencing self”.
The paradox focuses on Linda, a fictional character that Kahneman and Tversky devised in the early 1980s. Linda is described as young, outspoken, bright, going to antinuclear demonstrations and fired up about social justice. Experimental subjects were asked which was more probable: A) that Linda was a bank teller, or B) that Linda was a bank teller who was active in the feminist movement. Most experimental subjects answered B).
However, if you treat Linda as being plucked from a set of bank tellers, then the set of feminist bank tellers cannot be larger than the set of bank tellers in general, and hence B) cannot be the more probable option. In other words, the subjects who chose B) were making a logical error. Kahneman and Tversky called it the “conjunction fallacy”, and it became one of their most celebrated and controversial discoveries.
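The set-counting argument can be checked on a toy example. The ten-person population below is entirely hypothetical; the point of the sketch is only that, however the labels are distributed, the people who are both bank tellers and feminists are a subset of the bank tellers.

```python
# Hypothetical population: each entry is (is_bank_teller, is_feminist).
people = [
    (True, True), (True, False), (True, False),
    (False, True), (False, True), (False, True), (False, True),
    (False, False), (False, False), (False, False),
]

n = len(people)
p_teller = sum(1 for teller, _ in people if teller) / n
p_teller_and_feminist = sum(1 for teller, feminist in people if teller and feminist) / n

# The conjunction can never be more probable than either of its parts.
print(p_teller, p_teller_and_feminist)
```

Changing the entries changes both proportions, but never the ordering between them.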
Kahneman gives a thoughtful account of the Linda paradox and other similar experiments designed to test the conjunction fallacy. Taken together, these experiments provide convincing evidence that people trip up when asked to assess likelihood in purely statistical terms.
But is the world fully described by statistical relations? What I mean here is categorising everything into propositions and counting their occurrences within sets of more general phenomena, allowing experiments to be designed and scientific inferences to be made. Social sciences including psychology and economics couldn’t call themselves ‘sciences’ without assuming this.
For hard facts, like the incidence or fatality rate of a disease, there can be no doubt that statistics provides the window onto truth. In the social sciences, where data come in the shape of numbers, statistics can be dangerously alluring. In finance, where the methods of statistics led to the invention of value-at-risk, it’s clear today that VaR doesn’t come close to describing the risks that banks face. For softer forms of information, such as how people perceive themselves and each other, the claims of statistics are even more tenuous.
Think again about the Linda Paradox. Kahneman and Tversky called it the conjunction fallacy because in the jargon of logic the statement ‘Linda is a bank teller and a feminist’ is a conjunction of two propositions. In this statement, the word ‘and’ is a logical operator which carries directly over into probabilities. The fallacy rears its head again later in the book where Kahneman discusses ‘experienced utility’, or the sum of happiness over time, which experimental subjects give a lower value than their most recent memory of happiness.
However, the ‘and operator’ isn’t the same as the word ‘and’ in ordinary English. In his 1979 book Gödel, Escher, Bach, Douglas Hofstadter made this point in a Lewis Carroll-style dialogue between Achilles and a tortoise, where a valid conjunction of propositions in logical terms reads like an absurd self-contradiction in English. Could this tension lie at the heart of the Linda Paradox?
As Hofstadter explained, the structure of logic is richer than one consisting purely of sets of propositions that can be enumerated and analysed statistically. In particular, so-called predicate logic allows individual things or people to be described as having different qualities. Linda, the individual, can be festooned with qualities, and as more and more of them are draped on her like jewellery, some, such as ‘bank teller’, feel less plausible even when bundled together with the very plausible quality ‘feminist’.
This more powerful form of logic has its own limitations: it leads to Gödel’s theorem, according to which not all true statements can be proved. And it is resistant to the machinery of probability. So-called ‘first-order probabilistic logic’ remains an unsolved problem among mathematicians and computer scientists.
Even predicate logic fails to do justice to reality, with the qualities we attach to objects and people dynamically becoming objects themselves, and so on. Statistical inference based purely on propositions (and remember this underpins all of social science) can never uncover more than a tiny part of the truth about the world of people, language and meaning.
According to Stuart Russell, a University of California computer scientist who jointly wrote a classic artificial intelligence textbook with Google’s head of research Peter Norvig, “you could not write a grammar in propositional logic. In propositional logic there are no objects. It’s somewhat analogous to the difference between wiring up a circuit to turn lights on and writing software”.
Russell told me this eleven years ago, and some day I will publish my book and provide more detail on these fascinating debates about probability at the cutting edge of AI. Since I spoke to Russell, enormous hopes have been pinned on statistical inference as a basis for corporate valuations and political initiatives. Kahneman provides a masterful summary of how to judge human decision making against this benchmark. But as for the benchmark itself—there still remain more questions than answers.
1. Ramsey argued that the only fundamental quantity was the increase or decrease in happiness that a person assigned to future states of the world compared to their current state. People made choices between certainty (the default choice or benchmark) and uncertain bets by ranking the happiness or unhappiness of the possible outcomes. If their probability-weighted expectation of happiness exceeded the happiness from certainty, then people would choose the bet. Only at this crossover point, where it led to a decision to accept a bet, could probability be said to exist. At this point, probability could be mathematically derived from the change in happiness or unhappiness (compared to the benchmark, certainty) that an uncertain event produced; aside from this, probability has no independent existence at all.
(F.P. Ramsey, Truth and Probability, published in The Foundations of Mathematics, Routledge & Kegan Paul 1931)
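The crossover point described in this note can be written out for the simplest case. What follows is a deliberately simplified two-outcome reading, not Ramsey’s full construction: it assumes the certain option leaves happiness unchanged, that winning the bet adds `gain` and losing subtracts `loss`, and the function name is hypothetical.

```python
def implied_probability(gain: float, loss: float) -> float:
    """Probability at which a two-outcome bet is exactly as attractive as
    staying put, i.e. the p that solves p * gain - (1 - p) * loss == 0."""
    if gain <= 0 or loss <= 0:
        raise ValueError("gain and loss must both be positive")
    return loss / (gain + loss)


# Hypothetical bettor: winning adds 3 units of happiness, losing costs 1.
p = implied_probability(gain=3.0, loss=1.0)
print(p)
```

On this reading the probability is read off from the stakes a person is just willing to accept, rather than supplied from outside.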
2. Achilles: The idea is that since you believed sentence 1 (“My shell is green”), AND you believed sentence 2 (“My shell is not green”), you would believe one compound sentence in which both were combined, wouldn’t you?
Tortoise: Of course. It would only be reasonable…providing just that the manner of combination is universally acceptable.
(Douglas R. Hofstadter, Gödel, Escher, Bach: An Eternal Golden Braid, Basic Books Inc. 1979)