Saturday, October 17, 2015

17/10/15: Let’s talk about the Law of Small Numbers

Wonkishly awesome, folks…

Let’s start with a setup

You decide to flip a coin 4 times in a row and record the outcome of each flip. After you are done flipping, you look at every flip that “immediately followed an outcome of heads, and compute the relative frequency of heads on those flips”.

“Because the coin is fair, [you] of course expect this empirical probability of heads to be equal to the true probability of flipping a heads: 0.5.”

You will be wrong. If you “were to sample one million fair coins and flip each coin 4 times, observing the conditional relative frequency for each coin, on average the relative frequency would be approximately 0.4.”
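To see where the roughly 0.4 comes from, here is a minimal sketch that lists every possible 4-flip sequence and averages the per-sequence relative frequencies. The helper name `heads_after_heads` is made up for illustration, and dropping the two sequences with no heads in the first three flips is my reading of the set-up (no flip follows a heads there, so no frequency can be computed).

```python
from fractions import Fraction
from itertools import product

def heads_after_heads(seq):
    """Return (heads, total) over the flips that immediately follow a heads."""
    followers = [nxt for prev, nxt in zip(seq, seq[1:]) if prev == "H"]
    return followers.count("H"), len(followers)

rows = []
for seq in product("HT", repeat=4):
    heads, total = heads_after_heads(seq)
    if total == 0:
        continue  # TTTT and TTTH: no flip follows a heads, so no frequency exists
    rows.append(("".join(seq), Fraction(heads, total)))

# Every remaining sequence is equally likely for a fair coin,
# so the expectation is a plain average over sequences.
expected = sum(freq for _, freq in rows) / len(rows)

for seq, freq in rows:
    print(seq, freq)
print("sequences used:", len(rows))
print("expected relative frequency:", expected, "~", round(float(expected), 4))
```

The 14 usable sequences average out to 17/42, or about 0.405, which is the "approximately 0.4" in the quote.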

Two researchers, Joshua Miller and Adam Sanjurjo, “demonstrate that in a finite sequence generated by i.i.d. [independent, identically distributed] Bernoulli trials with probability of success p, the relative frequency of success on those trials that immediately follow a streak of one, or more, consecutive successes is expected to be strictly less than p, i.e. the empirical probability of success on such trials is a biased estimator of the true conditional probability of success.”

Which implies

So far, pretty innocuous from the average punter’s perspective. But wait. “While, in general, the bias does decrease as the sequence gets longer, for a range of sequence (and streak) lengths often used in empirical work it remains substantial, and increases in streak length.” In other words, while the empirical probability does get closer and closer to the true conditional probability, it does so only in sequences so long (so many coin flips) that such convergence does not make much of a difference in our, human, decision making.

And that is pretty pesky for the way we look at probabilistic outcomes and make decisions based on our expectations, whenever our decisions are sequential.
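How the bias moves with sequence length and streak length can be checked by brute force for short sequences. The sketch below is only an illustration, not the authors' method: `expected_frequency` is a hypothetical helper, the values of n and k are arbitrary choices of mine, and full enumeration is only practical for small n.

```python
from fractions import Fraction
from itertools import product

def expected_frequency(n, k, p=Fraction(1, 2)):
    """Exact expected relative frequency of success on trials that immediately
    follow k consecutive successes, across length-n Bernoulli(p) sequences
    that contain at least one such trial."""
    weighted_sum = Fraction(0)
    total_weight = Fraction(0)
    for seq in product((1, 0), repeat=n):
        # Trials i whose k predecessors were all successes.
        followers = [seq[i] for i in range(k, n) if all(seq[i - k:i])]
        if not followers:
            continue  # the frequency is undefined for this sequence
        successes = sum(seq)
        weight = p ** successes * (1 - p) ** (n - successes)
        weighted_sum += weight * Fraction(sum(followers), len(followers))
        total_weight += weight
    return weighted_sum / total_weight

# Illustrative grid only: n = sequence length, k = streak length.
for k in (1, 2, 3):
    for n in (6, 10, 14):
        value = expected_frequency(n, k)
        print(f"k={k} n={n:2d} expected={float(value):.4f} bias={float(value) - 0.5:+.4f}")
```

For a fair coin every printed expectation should sit below 0.5, in line with the quoted result. Comparing rows with the same n shows how the gap changes with the streak length k.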

Impact on decision making

“This result has considerable implications for the study of decision making in any environment that involves sequential data”. These implications are:

  1. This provides “a structural explanation for the persistence of one of the most well-documented, and robust, systematic errors in beliefs regarding sequential data—that people have an alternation bias (also known as negative recency bias)… — by which they believe, for example, that when observing multiple flips of a fair coin, an outcome of heads is more likely to be followed by a tails than by another heads”;
  2. It also helps resolve “…the closely related gambler’s fallacy…, in which this alternation bias increases with the length of the streak of heads.”
  3. “Further, the result shows that data in the hot hand fallacy literature …has been systematically misinterpreted by researchers; for those trials that immediately follow a streak of successes, observing that the relative frequency of success is equal to the overall base rate of success, is in fact evidence in favor of the hot hand, rather than evidence against it.”

And tangible applications are

So the realisation that “the empirical probability of success on such trials is a biased estimator of the true conditional probability of success” helps explain why “…the inability of the gambler to detect the fallacy of his belief in alternation has an exact parallel with the researcher’s inability to detect his mistake when concluding that experts’ belief in the hot hand is a fallacy.”

But there is more. Per the authors, “the result may have implications for evaluation and compensation systems. That a coin is expected to exhibit an alternation “bias” in finite sequences implies that the outcome of a flip can be successfully “predicted” in finite sequences at a rate better than that of chance (if one is free to choose when to predict).”

They offer the following example of this: “suppose that each day a stock index goes either up or down, according to a random walk in which the probability of going up is, say, 0.6. A financial analyst who can predict the next day’s performance on the days she chooses to, and whose predictions are evaluated in terms of how her success rate on predictions in a given month compares to that of chance, can expect to outperform this benchmark… For instance, she can simply predict “up” immediately following down days, or increase her expected relative performance even further by predicting “up” only immediately following longer streaks of consecutive down days.”
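The analyst example is easy to try out in a quick simulation. This is a rough sketch under my own assumptions: 21 trading days per month, 50,000 simulated months, independent daily moves with a 0.6 chance of "up", and an analyst who predicts "up" only on days that immediately follow a down day. Her score is her average monthly success rate, with months in which she makes no prediction left out.

```python
import random

P_UP = 0.6          # probability of an up day, as in the quoted example
DAYS = 21           # assumed number of trading days in a month
MONTHS = 50_000     # assumed number of simulated months

rng = random.Random(2015)  # fixed seed so the run is repeatable
monthly_rates = []
for _ in range(MONTHS):
    days = [rng.random() < P_UP for _ in range(DAYS)]  # True means "up"
    # She predicts "up" only on days that immediately follow a down day.
    outcomes = [today for yesterday, today in zip(days, days[1:]) if not yesterday]
    if outcomes:  # skip months in which she never gets to predict
        monthly_rates.append(sum(outcomes) / len(outcomes))

# Each month counts once, however many predictions it contained.
average_rate = sum(monthly_rates) / len(monthly_rates)
print(f"average monthly success rate: {average_rate:.4f} (benchmark {P_UP})")
```

Her average monthly success rate should come out above the 0.6 benchmark even though she has no predictive skill at all: the edge comes purely from scoring each month as a separate unit.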

Going back to the first example with coin flipping, one might expect the law of large numbers to imply that, as the sample size (the number of coins sampled) rises, “…the average empirical probability of heads would approach the true probability. The key to why this is not the case, and to why the bias remains, is that it is not the flip that is treated as the unit of analysis, but rather the sequence of flips from each coin. In particular, if [you] were willing to assume that each sequence had been generated by the same coin, and [you] were to compute the empirical probability by instead pooling together all of those flips that immediately follow a heads, regardless of which coin produced them, then the bias would converge to zero as the number of coins approaches infinity.”

What this means is that “…in treating the sequence as the unit of analysis, the average empirical probability across coins amounts to an unweighted average that does not account for the number of flips that immediately follow a heads in each sequence, and thus leads the data to appear consistent with the gambler’s fallacy.”
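The difference between the two ways of averaging can be seen side by side in a small simulation. Again a sketch with assumed settings (100,000 fair coins, 4 flips each, a fixed seed): it computes the unweighted average of per-coin frequencies and the pooled frequency over all flips that follow a heads.

```python
import random

COINS = 100_000   # assumed number of simulated coins
FLIPS = 4

rng = random.Random(17)  # fixed seed so the run is repeatable
per_coin = []        # one relative frequency per coin (sequence as the unit)
pooled_heads = 0     # heads among all flips that follow a heads
pooled_total = 0     # all flips that follow a heads, across every coin

for _ in range(COINS):
    flips = [rng.random() < 0.5 for _ in range(FLIPS)]  # True means heads
    followers = [nxt for prev, nxt in zip(flips, flips[1:]) if prev]
    if not followers:
        continue  # this coin never showed a heads in its first three flips
    per_coin.append(sum(followers) / len(followers))
    pooled_heads += sum(followers)
    pooled_total += len(followers)

unweighted = sum(per_coin) / len(per_coin)
pooled = pooled_heads / pooled_total
print(f"unweighted average across coins: {unweighted:.3f}")
print(f"pooled across all flips:         {pooled:.3f}")
```

The unweighted figure should land near 0.4 and the pooled one near 0.5. A coin with three flips following a heads counts no more than a coin with one in the first measure, and that is where the gap comes from.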

Per the authors, “the implications for learning are stark: to the extent that decision makers update their beliefs regarding sequential dependence with the (unweighted) empirical probabilities that they observe in finite length sequences, they can never unlearn a belief in the gambler’s fallacy…”

Overall, we have

To sum this up, the authors found “a subtle but substantial bias in a standard measure of the conditional dependence of present outcomes on streaks of past outcomes… The mechanism is a form of selection bias, which leads the empirical probability …to underestimate the true probability of a given outcome, when conditioning on prior outcomes of the same kind. The biased measure has been used prominently in the literature that investigates incorrect beliefs in sequential decision making --- most notably the Gambler's Fallacy and the Hot Hand Fallacy.”

The two fallacies are defined as follows:

  • “…People believe outcomes alternate more than they actually do, e.g. for a fair coin, after observing a flip of a tails, people believe that the next flip is more likely to produce a heads than a tails. Further, as a streak of identical outcomes increases in length, people also tend to think that the alternation rate on the outcome that follows becomes even larger, which is known as the gambler’s fallacy”.
  • “The hot hand fallacy typically refers to the mistaken belief that success tends to follow success (hot hand), when in fact observed successes are consistent with the typical fluctuations of a chance process.”

After correcting for the bias, the authors show that “the conclusions of some prominent studies in the literature are reversed.” Awesomely wonkish...

Full paper: Miller, Joshua Benjamin and Sanjurjo, Adam, Surprised by the Gambler's and Hot Hand Fallacies? A Truth in the Law of Small Numbers (September 15, 2015). IGIER Working Paper #552.
