Much of our thinking is flawed because it is based on faulty intuition. But by using the framework and tools of probability and statistics, we can overcome this to provide solutions to many real-world problems and paradoxes. Further and deeper exploration of paradoxes and challenges of intuition and logic can be found in my recently published book, Probability, Choice and Reason.

When it comes to situations like waiting for a bus, our intuition is often wrong.
Imagine there’s a bus that arrives every 30 minutes on average, and you arrive at the bus stop with no idea when the last bus left. How long can you expect to wait for the next bus? Intuitively, half of 30 minutes sounds right, but you’d be very lucky to wait only 15 minutes.
Say, for example, that half the time the buses arrive at a 20-minute interval and half the time at a 40-minute interval. The overall average is still 30 minutes. From your point of view, however, it is twice as likely that you’ll turn up during a 40-minute interval as during a 20-minute interval.
This is true in every case except when the buses arrive at exact 30-minute intervals. As the dispersion around the average increases, so does the amount by which the expected wait time exceeds the average wait. This is the Inspection Paradox, which states that whenever you “inspect” a process, you are likely to find that things take (or last) longer than their “uninspected” average. What seems like the persistence of bad luck is simply the laws of probability and statistics playing out their natural course.
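The bus example can be sketched in a short simulation, a minimal sketch using the illustrative 20/40-minute timetable from the text:

```python
import random

# A minimal simulation of the bus example: intervals of 20 or 40
# minutes with equal probability, so the average interval is 30
# minutes. A passenger arriving at a uniformly random moment lands in
# an interval with probability proportional to its length, and waits
# half that interval on average.

random.seed(42)

intervals = [random.choice([20, 40]) for _ in range(100_000)]
total_time = sum(intervals)

expected_wait = sum(i * (i / total_time) / 2 for i in intervals)

print(f"Average interval: {total_time / len(intervals):.1f} minutes")  # about 30
print(f"Expected wait:    {expected_wait:.1f} minutes")                # about 16.7, not 15
```

The longer intervals soak up a disproportionate share of arrival times, which is exactly the length-biased sampling the paradox describes.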
Once made aware of the paradox, it seems to appear all over the place.
For example, let’s say you want to take a survey of the average class size at a college. Say that the college has class sizes of either 10 or 50, and there are equal numbers of each. So the overall average class size is 30. But in selecting a random student, it is five times more likely that he or she will come from a class of 50 students than of 10 students. So for every one student who replies “10” to your enquiry about their class size, there will be five who answer “50”. The average class size thrown up by your survey is nearer 50, therefore, than 30. So the act of inspecting the class sizes significantly increases the average obtained compared to the true, uninspected average. The only circumstance in which the inspected and uninspected average coincides is when every class size is equal.
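A quick calculation confirms the class-size example; with five classes of each size, the student-weighted survey average works out to about 43.3:

```python
# The class-size example in numbers: equal numbers of 10-student and
# 50-student classes (here five of each, an illustrative count).

class_sizes = [10, 50] * 5

# Uninspected average: average over classes.
true_avg = sum(class_sizes) / len(class_sizes)

# Inspected average: average over students, each reporting the size of
# their own class.
students = [size for size in class_sizes for _ in range(size)]
survey_avg = sum(students) / len(students)

print(true_avg)              # 30.0
print(round(survey_avg, 1))  # 43.3
```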
We can examine the same paradox within the context of what is known as length-based sampling. For example, when digging up potatoes, why does the fork go through the very large one? Why does the network connection break down during download of the largest file? It is not because you were born unlucky but because these outcomes occur for a greater extension of space or time than the average extension of space or time.
Once you know about the Inspection Paradox, the world and our perception of our place in it are never quite the same again.
Now suppose you line up at the medical practice to be tested for a virus. The test is 99% accurate and you test positive. What is the chance that you have the virus? The intuitive answer is 99%. But is that right? The information we are given relates to the probability of testing positive given that you have the virus. What we want to know, however, is the probability of having the virus given that you test positive. Common intuition conflates these two probabilities, but they are very different. This is an instance of the Inverse or Prosecutor’s Fallacy.
The significance of the test result depends on the probability that you have the virus before taking the test. This is known as the prior probability. Essentially, we have a competition between how rare the virus is (the base rate) and how rarely the test is wrong. Let’s say there is a 1 in 100 chance, based on local prevalence rates, that you have the virus before taking the test. Now, recall that the test is wrong one time in 100. These two probabilities are equal, so the chance that you have the virus when testing positive is 1 in 2, despite the test being 99% accurate. But what if you are showing symptoms of the virus before being tested? In this case, we should update the prior probability to something higher than the prevalence rate in the tested population. The chance you have the virus when you test positive rises accordingly. We can use Bayes’ Theorem to perform the calculations.
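The calculation can be sketched with Bayes’ Theorem. The 1% prior and 99% accuracy are from the text; the 10% ‘with symptoms’ prior below is an illustrative assumption:

```python
# The virus-test calculation via Bayes' Theorem. Assumes the test is
# 99% accurate both ways (99% sensitivity, 1% false-positive rate).

def posterior(prior, sensitivity, false_positive_rate):
    """P(virus | positive test)."""
    true_pos = prior * sensitivity
    false_pos = (1 - prior) * false_positive_rate
    return true_pos / (true_pos + false_pos)

print(posterior(0.01, 0.99, 0.01))  # 0.5: a coin flip, despite 99% accuracy
print(posterior(0.10, 0.99, 0.01))  # about 0.92 once symptoms raise the prior
```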
In summary, intuition often lets us down. Still, by applying the methods of probability and statistics, we can defy intuition. We can even resolve what might seem to many the greatest mystery of them all – why we seem so often to find ourselves stuck in the slower lane or queue. Intuitively, we were born unlucky. The logical answer to the Slower Lane Puzzle is that it’s exactly where we should expect to be!
When intuition fails, we can always use probability and statistics to look for the real answers.
Leighton Vaughan Williams, Professor of Economics and Finance at Nottingham Business School. Read more in Leighton’s new publication Probability, Choice and Reason.
Lessons from a Beauty Contest
A version of this article appears in my book, Twisted Logic: Puzzles, Paradoxes, and Big Questions (Chapman and Hall/CRC Press, 2024).
THE NUMBER DILEMMA
In the Number Dilemma, participants must choose a whole number between 0 and 100, aiming to get closest to two-thirds of the average number chosen by all participants. This scenario tests not only numerical reasoning but also understanding of human behaviour.
LEVEL 1 RATIONALITY: CHALLENGING THE AVERAGE
If you were to assume that the other participants would choose a random number within the given range, the average number chosen by everyone would be 50. Under this assumption, you might believe that choosing 33, the nearest integer to two-thirds of 50, would provide a high probability of winning. This initial strategy, known as Level 1 rationality, might appear intuitively logical.
LEVEL 2 RATIONALITY: ANTICIPATING THE AVERAGE OF THE AVERAGE
However, upon closer inspection, a new insight emerges. Since you reasoned that choosing 33 was a smart move, it is reasonable to assume that other participants will arrive at the same conclusion. Consequently, the average number chosen by all participants would shift towards 33. To maximise your chances of winning, you decide to adopt Level 2 rationality and choose a number lower than 33. In this case, 22 appears to be the optimal choice.
LEVEL 3 RATIONALITY: GOING DEEPER INTO ANTICIPATION
As you delve deeper into the rationality levels, a pattern begins to emerge. Just as you contemplated that others might select 22, they too will likely adopt the same line of reasoning. To outsmart them, you employ Level 3 rationality and opt for the number 15. The idea is to anticipate the choices of others and select a number that is two-thirds of the average they might choose.
LEVELS OF RATIONALITY
In summary, the levels of rationality illustrate the iterative process of outthinking others.
APPROACHING ZERO: THE ULTIMATE RATIONAL CHOICE
As you progress through each level of rationality, however, you cannot help but notice a concerning trend. As rationality levels increase, choices converge towards zero, posing a paradox: Is zero really the most rational choice when considering human diversity in decision-making? Deep down, you begin to question the effectiveness of choosing zero.
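The convergence towards zero can be sketched by iterating the two-thirds rule, here without the rounding to whole numbers used in the text:

```python
# The levels-of-rationality iteration: each level takes two-thirds of
# the previous level's guess, starting from the naive average of 50.

guess = 50.0
for level in range(1, 11):
    guess = guess * 2 / 3
    print(f"Level {level:2d}: {guess:.2f}")
# Level 1: 33.33, Level 2: 22.22, Level 3: 14.81, ... converging on zero
```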
ACCOUNTING FOR DIFFERENT LEVELS OF RATIONALITY
Your uncertainty arises from the realisation that not all participants are likely to think or reason in the same way. Variations in human rationality mean that some may choose randomly or with less strategic depth, affecting the overall average and optimal choice.
THE WINNING NUMBER
In a practical application of this dilemma involving Financial Times readers, the winning number was 13, showcasing the unpredictability of collective rationality.
KEYNESIAN BEAUTY CONTEST IN FINANCIAL MARKETS
The economist John Maynard Keynes encapsulated the essence of the dilemma in his work, ‘The General Theory of Employment, Interest and Money’. He likened professional investment to a newspaper competition where participants must select the prettiest faces from a selection of photographs. The prize is awarded to the competitor whose choices align most closely with the average choice of all participants.
OUTGUESSING THE BEST GUESSES
Keynes emphasised that participants should not merely choose what they believe to be the prettiest faces according to their own judgment or average public opinion. Instead, they should anticipate what average opinion expects the average opinion to be. In essence, winning the competition relies on outguessing the best guesses of others—a strategy referred to as super-rationality. Just as in the Number Dilemma, the Keynesian Beauty Contest involves predicting others’ predictions, a strategy crucial in financial markets and investment decisions.
DISCOVERING THE HIDDEN OPPORTUNITIES
In the context of this so-called Keynesian Beauty Contest, the concept of super-rationality holds tremendous significance. This strategy involves outthinking the crowd’s average opinion, a concept that can reveal overlooked opportunities in various contexts. By transcending the common line of reasoning and adopting a super-rational approach, individuals can unveil hidden possibilities and potentially reap rewards. While these concepts offer intriguing insights, their practical application is complex due to the unpredictable nature of human decision-making and diverse levels of rationality.
CONCLUSION: EMBRACING SUPER-RATIONALITY
The Keynesian Beauty Contest serves as a captivating thought experiment that challenges traditional notions of rational decision-making. It showcases the complexities of human behaviour and highlights the importance of anticipating the actions of others. By embracing the concept of super-rationality and outguessing the best guesses of the crowd, individuals can navigate these intricacies and increase their chances of success.
Exploring the Bad Luck Syndrome
THE SLOWER LINE PARADOX
Is the line next to you at the airport check-in or the supermarket checkout always quicker than the one you are in? In heavy traffic, is the neighbouring lane always moving a little more quickly than yours? We’ve all experienced it. Or does it just seem that way?
THE ILLUSION OF THE SLOWER LINE
One explanation for the perception of always being in the slower line or lane can be attributed to basic human psychology. Our tendency to notice and remember the times when we’re left behind, while quickly forgetting the moments we overtake others, may play a role in this feeling. Or might it be an illusion caused by our tendency to glance over at the neighbouring option more often when we are progressing slowly rather than quickly? Additionally, our focus tends to be more forward-looking, so when driving, for example, vehicles we overtake quickly fade from our memory while those remaining in front continue to torment us.
The question then arises: Is this perception all an illusion or is there a real and fundamental phenomenon at play? Philosopher Nick Bostrom suggests that the effect is real and is the consequence of an observer selection effect. It is not just a trick of the mind.
THE SELECTION EFFECT
To understand why we might frequently find ourselves in the slower lane, let’s consider an example of fish in a pond. If we catch sixty fish, all of which are more than six inches long, does this evidence support the hypothesis that all the fish in the pond are longer than six inches?
The answer depends on whether our net is capable of catching fish smaller than six inches. If the holes in the net allow smaller fish to pass through, our sample of fish would be biased towards the larger ones. This is known as a selection effect or an observation bias.
Now, just as a fisherman’s net biased towards larger fish can misrepresent the pond’s population, our position in a slower lane biases our perception of overall speed and flow.
RANDOMLY SELECTED OBSERVERS
When considering whether we are more often in the slower of two lines at the supermarket checkout, it is crucial to ask: ‘For a randomly selected person, are the people in the next line actually progressing faster?’ We need to view ourselves as random observers and think about the implications of this perspective for our observations.
An apparent reason why we might find ourselves driving in a slower moving lane after choosing one of two apparently equal options is the greater number of vehicles in the slower lane compared to the neighbouring lane. Cars travelling at higher speeds are generally more spread out than slower cars, so a given stretch of road is likely to have more cars in the slower lane. Consequently, the average driver will spend more time in the slower lane or lanes. This phenomenon is known as an observer selection effect, where observers should reason as if they were randomly selected from the entire set of observers.
THE VIEWPOINT OF THE MAJORITY
To put it simply, if we perceive our present observation as a random sample from all observations made by all relevant observers, the probability is that our observation will align with the perspective of most drivers, and these are typically in the slower-moving lane. Because of this observer effect, a randomly selected driver will not only seem to be in the slower lane, but will actually be in the slower lane.
In other words, when we view ourselves as part of a larger group of observers, we realise that being in a slower lane or line is more than perception; it’s a statistical likelihood.
For instance, if there are 20 observers in the slower lane and 10 in the equivalent section of the other faster lane, there is a 2/3 chance that we are in the slower lane.
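A toy simulation, using the 20-versus-10 figures above, confirms the two-thirds probability:

```python
import random

# A toy check of the observer selection effect: on this stretch of
# road the denser slow lane holds 20 drivers, the fast lane 10, so a
# randomly selected driver is in the slower lane two-thirds of the time.

random.seed(1)
drivers = ["slow"] * 20 + ["fast"] * 10

samples = [random.choice(drivers) for _ in range(100_000)]
p_slow = samples.count("slow") / len(samples)
print(f"P(in the slower lane) = {p_slow:.2f}")  # about 0.67
```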
CONCLUSION: EMBRACING THE REALITY
So whenever we think that the other lane or line is faster, we should be aware that it very probably is. Our perception aligns with the reality that the slower line tends to contain more observers, leading to a higher likelihood of finding ourselves in it.
Understanding the Slower Line Paradox isn’t just about traffic or queues, though. It’s a lesson in perspective and probability, reminding us that our individual experiences often reflect broader regularities. So it’s not bad luck, after all, but sound statistics. Embracing this reality should make us feel a whole lot better! Until the next time it happens to us.
Exploring the Doomsday Argument
A version of this article appears in my book, Twisted Logic: Puzzles, Paradoxes, and Big Questions (Chapman and Hall/CRC Press, 2024).
CONTEMPLATING OUR EXISTENTIAL PREDICAMENT
The Doomsday Argument is a statistical and philosophical approach predicting humanity’s potential end. It uses principles of probability to suggest that humanity might be closer to its demise than we commonly believe.
PROBABILITY AND ITS IMPLICATIONS
Imagine attempting to estimate your enemy’s tank count. The tanks are sequentially manufactured, starting from one. You uncover serial numbers on five random tanks, all being under 10. In such a scenario, an intuitive grasp of probability would lead you to believe that your enemy doesn’t possess a large number of tanks. However, if you stumble upon serial numbers stretching into the thousands, your estimate would justifiably swing towards a much larger count.
In another scenario, consider a box filled with numbered balls, which can either contain ten balls (numbered 1–10) or ten thousand balls (numbered 1–10,000). If a ball drawn from the box reveals a single-digit number, such as seven, it is reasonable to assume that the box is much more likely to contain ten balls than ten thousand.
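Assuming equal prior odds on the two boxes, Bayes’ Theorem makes the intuition precise:

```python
# Bayes' Theorem applied to the numbered-balls example, assuming equal
# prior odds on the 10-ball and 10,000-ball boxes.

def posterior_small(draw, small=10, large=10_000):
    """P(box is the small one | the number on the drawn ball)."""
    like_small = 1 / small if draw <= small else 0  # likelihood under small box
    like_large = 1 / large if draw <= large else 0  # likelihood under large box
    return like_small / (like_small + like_large)

print(posterior_small(7))  # about 0.999: the small box is overwhelmingly likely
```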
INVOKING THE MEDIOCRITY PRINCIPLE AND COPERNICAN PRINCIPLE
The tank and numbered balls examples tie closely to the concept of mediocrity, as captured in the ‘mediocrity principle’. This principle suggests that initial assumptions should lean towards mediocrity rather than the exceptional. In other words, we are more likely to encounter ordinary circumstances rather than extraordinary ones.
The Copernican principle dovetails with the mediocrity principle. It argues that we are not privileged or exceptional observers of the universe. This principle is rooted in Nicolaus Copernicus’s 16th-century finding that Earth does not occupy a central, special position in the universe.
GOTT’S WALL PREDICTION
Astrophysicist John Richard Gott took the Copernican principle to heart during his visit to the Berlin Wall in 1969. Lacking specific knowledge about the Wall’s expected lifespan, Gott took the position that his encounter with the Wall did not occur at any special time in its existence.
This assumption allowed him to estimate the future lifespan of the Wall. If, for instance, his visit fell precisely halfway through its life, the Wall would stand for another eight years. If he visited one-quarter of the way into its life, the Wall would stand for another 24 years; if three-quarters of the way along, for another third of its past. Because half of the Wall’s lifetime lies between these two points (75% minus 25% is 50%), there was a 50% chance that his visit fell in that range, implying that the Wall would last a further period of between one-third and three times its current age. Based on its age when he observed it in 1969 (eight years), Gott argued that there was a 50% chance that it would fall between 8/3 years (2 years, 8 months) and 8 × 3 (24) years from then.
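Gott’s 50% interval can be expressed as a one-line rule, a sketch of the calculation just described:

```python
# Gott's 50% 'Copernican time horizon': assuming the moment of
# observation falls uniformly within a thing's lifetime, there is a
# 50% chance it lies between the one-quarter and three-quarter marks,
# so the remaining lifespan is between one-third and three times the
# current age.

def copernican_interval(current_age):
    """50% interval for remaining lifespan, in the same units as the age."""
    return current_age / 3, current_age * 3

low, high = copernican_interval(8)  # the Berlin Wall was 8 years old in 1969
print(f"50% chance of surviving between {low:.1f} and {high:.0f} more years")
```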
The Berlin Wall fell in 1989, 20 years after Gott’s visit and roughly 28 years after it was built. This bolstered Gott’s confidence in the Copernican-based method of making predictions, which he termed ‘Copernican time horizons’.
The implications of Gott’s Wall are far-reaching. They suggest that we could potentially apply the Copernican principle to make predictions about other systems where we have little information about their total lifespan. For example, it could be applied to predict the lifespan of a company, the duration of a war, or the longevity of a species, among many other things.
However, it’s essential to acknowledge the limitations of this method. It is predicated on the assumption that there is nothing special about the moment of observation, an assumption that may not hold true in many scenarios. Despite these limitations, Gott’s approach represents a fascinating application of the Copernican principle to real-world events, demonstrating how our position in time, just as in space, can be used to gain insights about the world around us.
THE LINDY EFFECT AND ITS LIMITATIONS
Gott’s method finds resonance with the ‘Lindy effect’, named after a New York delicatessen famous for its cheesecakes and frequented by actors appearing in Broadway shows. The effect suggests that a show that had been running for three years could be expected, on average, to last about another three years.
However, the Lindy effect has limitations. It breaks down when applied to processes like biological ageing. For instance, a human who has lived for 100 years is very unlikely indeed to live another 100 years. The factors influencing human lifespan are far from random, rendering the Lindy effect ineffective for such predictions.
FROM COPERNICAN PRINCIPLE TO DOOMSDAY ARGUMENT
The Doomsday Argument employs Gott’s idea to estimate the Doomsday date for the human race. Applied to humanity, the argument contends that if we consider humanity’s entire history, we should statistically find ourselves somewhere around the middle of that history in terms of the human population. If our population continues to grow exponentially, this suggests that humanity has a relatively short lifespan left, potentially within this millennium.
ESTIMATES AND PROJECTIONS
This projection takes into account the fact that approximately 110 billion humans have lived on Earth to date, around 7% of whom are alive today. Following demographic trends forward and estimating how long it will take for a further 110 billion humans to be born, the Doomsday Argument anticipates that humanity’s timeline is likely to end well within this millennium.
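As a rough, hedged sanity check: the 110 billion figure is from the text, while the annual birth rate below is an assumed, roughly current figure that will of course change over time:

```python
# A back-of-the-envelope check of the Doomsday projection's timescale.

births_so_far = 110e9    # humans born to date (from the text)
births_per_year = 130e6  # assumed annual births (illustrative)

years_remaining = births_so_far / births_per_year
print(f"Roughly {years_remaining:.0f} years for another 110 billion births")
```

At that rate the second 110 billion births arrive in well under a thousand years, consistent with the ‘within this millennium’ claim.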
DEBATE AND CRITICISMS
The Doomsday Argument is not without its critics. Some argue that humanity will never go extinct, while others highlight that the argument’s assumptions might not hold true, such as the assumption that humans are at the midpoint of our existence timeline. Others claim that the argument fails to account for future scientific and technological developments that might significantly extend, or perhaps foreshorten, humanity’s lifespan.
CONCLUSION: THE FATE OF HUMANITY?
The Doomsday Argument provides a thought-provoking perspective on humanity’s potential fate. It integrates probability, statistics, and philosophical principles, offering a statistical guess at our collective demise. While it is far from conclusive, it is certainly important in serving as a reminder of our finite earthly existence and the urgency to address the global threats that could precipitate our doom. Whatever else, the debate around the argument and our ultimate fate as humans will persist, sparking further exploration into this fascinating intersection of probability, philosophy, and existential prediction.
A version of this article appears in my book, Twisted Logic: Puzzles, Paradoxes, and Big Questions (Chapman & Hall/CRC Press, 2024).
WHEN TO STOP LOOKING AND START CHOOSING
The ‘Secretary Problem’ is a classic scenario in decision-making and probability theory. The ‘Optimal Stopping Problem’ or ‘Secretary Problem’, as it is often called, offers insights into the dilemma of when to stop looking and start choosing. Whether it is finding the right partner, hiring the best assistant, or identifying the ideal place to live, this mathematical problem delivers a powerful solution.
CHOOSING A CAR
Let’s say that you have 20 used cars to choose from, offered to you in a random sequence. You have three minutes to evaluate each. Once you turn one down, there is no returning to it, such is the speed of turnover; the silver lining is that any vehicle you do select is guaranteed to be yours. If you come to the end of the line, you must accept whatever remains, even if it happens to be the least desirable. Your decision is guided solely by the relative merits of the vehicles on offer.
BALANCING BETWEEN TOO EARLY AND TOO LATE
There are two significant failures in your quest to find the best vehicle for you—stopping too early and stopping too late. If you stop too early, you might miss out on a better option. Conversely, if you stop too late, you risk passing over the best option while waiting for a better option that might not exist. So, how do you find the right balance?
INTRODUCING THE OPTIMAL STOPPING STRATEGY
Do you have a strategy that is better than random selection?
The Optimal Stopping Problem provides a solution. If there were just three cars on offer, the optimal stopping strategy would suggest rejecting the first in order to gain information about the relative merits of those available. If the second turns out to be worse, you should wait, despite the risk of ending up with the third, which could be the worst of the three. However, if the second is better, you should accept it immediately, forgoing the possibility that the third might be a better match.
EXTENDING THE STRATEGY: FROM 4 TO 100
With four options, you should reject the first. Again, if the second is better than the first, take that. If not, and the third is better, take that. Otherwise, you must take the fourth and hope for the best. With a hundred options, you should inspect the first 37 and then choose the first after that which is better than the best of the first 37.
This strategy, often referred to as the 37% Rule, is based on the mathematical constant e (Euler’s number). The value of 1/e is approximately 0.36788, or 36.788%, which rounds to 37%. Following this rule gives you roughly a 37% chance of ending up with the best car.
THE GROUNDWORK
When faced with a choice of n candidates for a job, the challenge lies in deciding when to stop the process of rejection and start the process of selection. The mathematical answer to this, as highlighted before, is to reject the first n/e candidates, where ‘e’ is the base of natural logarithms, approximately 2.7. So, if there are 100 choices, n/e becomes 100/2.7, which is about 37. This strategy effectively breaks the selection process into two phases: the assessment phase and the selection phase.
The resulting principle is, therefore, surprisingly straightforward: reject the first 37% of candidates to gather information about the quality of the pool, then select the next candidate who is better than anyone seen so far.
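A short simulation, under the problem’s idealised assumptions, shows the rule succeeding about 37% of the time:

```python
import math
import random

# A simulation sketch of the 37% Rule: reject the first n/e candidates,
# then accept the first later candidate better than all of them.

random.seed(0)

def pick_best(n, trials=20_000):
    """Fraction of trials in which the strategy selects the best candidate."""
    cutoff = round(n / math.e)  # n/e is about 37 for n = 100
    wins = 0
    for _ in range(trials):
        ranks = list(range(n))  # rank 0 is the best candidate
        random.shuffle(ranks)
        best_seen = min(ranks[:cutoff])
        # Take the first later candidate better than the assessment phase,
        # or the last candidate if none qualifies.
        chosen = next((r for r in ranks[cutoff:] if r < best_seen), ranks[-1])
        wins += chosen == 0
    return wins / trials

rate = pick_best(100)
print(f"Best candidate chosen in {rate:.1%} of trials")  # about 37%
```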
REAL-WORLD APPLICATIONS
While the Secretary Problem is a simplified and somewhat idealised situation, the 37% Rule can have valuable applications in real-world scenarios:
Job Hiring: Hiring managers can use the 37% rule as a strategic guideline during the candidate evaluation phase.
Home Hunting: The principle is also applicable as a heuristic when looking for a home to buy or rent, especially in a fast-moving market.
Online Shopping: This principle can also be useful when shopping online to streamline purchasing decisions. By reviewing a certain portion of available options before making a selection, shoppers can reduce the overwhelming array of choices and enhance their overall shopping efficiency and satisfaction.
CRITIQUES AND LIMITATIONS
While the 37% Rule provides a theoretically optimal solution to the Secretary Problem, it does have certain limitations:
Idealised Assumptions: The problem assumes that options are presented one at a time, in random order, and once rejected, they cannot be recalled.
Risk of Missing Out: Following the 37% Rule means you run the risk of the best option being rejected during the assessment phase.
Difficult to Determine the Total Pool: The problem assumes you know the total number of options upfront.
Emotional Considerations: The rule neglects emotional considerations, personal intuition, and human subjectivity.
ADAPTING THE RULE FOR UNCERTAINTY
The rule can be adapted if there is a chance that your chosen option might opt out or be withdrawn after you select it. For example, if that chance is 50%, the 37% rule becomes a 25% rule, reflecting the added uncertainty. There is also a rule of thumb for when the aim is to select a good option, if not necessarily the best. Out of 100 options, for example, the square root rule suggests seeing the first ten (the square root of 100) and then selecting the first option among those remaining that is better than the best of those ten.
CONCLUSION: EXPLORATION AND EXPLOITATION
The Secretary Problem teaches us about the balance between exploration (gathering information) and exploitation (making a decision), offering a structured approach to navigating complex choices. Despite its limitations, the 37% Rule offers valuable insight into these trade-offs and provides a mathematically grounded way of making difficult decisions.
Exploring the Expected Value Paradox
A version of this article appears in my book, Twisted Logic: Puzzles, Paradoxes, and Big Questions (Chapman and Hall/CRC Press, 2024).
UNDERSTANDING THE EXPECTED VALUE PARADOX
At its core, the Expected Value (EV) Paradox invites us to examine how outcomes differ when we analyse them through the lens of an ensemble (a large group participating in an event once) versus through time (a single individual participating in the event many times).
Take the example of a hypothetical coin-tossing game where players gain 50% of their bet if the coin lands on Heads and lose 40% if it lands on Tails. This game seems favourable for the player—the game has what is termed a positive expected value.
However, the paradox arises when the concept of time is introduced into the equation. While the game appears favourable in theory, it could lead to a net loss for an individual playing this game multiple times. As the coin is tossed more and more, the individual’s wealth may diminish over time, leading to a scenario where they lose all their money, even though the theoretical gain from playing the game is positive.
THE EXPERIMENT
Let’s set up an experiment involving a coin-tossing game with 100 participants, each with an initial stake of £10, to illustrate the difference. In this scenario, we’re employing what’s known as an ensemble perspective, where we’re examining a large group participating in an event once.
Statistically, given a fair coin, we would expect roughly half of the coin tosses to land on Heads and half on Tails. Therefore, of the 100 people, we predict that around 50 people will toss Heads and 50 will toss Tails.
If the coin lands on Heads, each of the 50 players stands to gain 50% of their stake, which is £5. In total, this translates to a combined gain of £250 (50 players × £5).
On the other side, if the coin lands on Tails, each of the remaining 50 players loses 40% of their stake, which is £4. This accumulates to a total loss of £200 (50 players × £4).
Subtracting the total loss from the total gain (£250 – £200), we find a net gain of £50 over all 100 players. When we average this out over the number of players, we see an average net gain of £0.5 (50 pence) per player (£50 ÷ 100 players), or 5% of the £10 initial stake.
THE PARADOX
The Expected Value Paradox becomes evident when we shift from an ensemble perspective, involving many people playing the game once, to a time perspective, involving one person playing the game multiple times.
Let’s examine a scenario where a single player engages in four rounds of the game, starting with a stake of £10. For simplicity’s sake, we’ll assume an equal chance of landing Heads or Tails—therefore expecting two Heads and two Tails.
When the coin lands on Heads in the first round, the player gains 50% of their stake, increasing their wealth to £15 (£10 + 50% of £10). If the coin lands on Heads again in the second round, their wealth grows to £22.50 (£15 + 50% of £15).
However, the game changes when the coin lands on Tails in the third round. The player loses 40% of their current wealth, reducing it to £13.50 (£22.50 minus 40% of £22.50). If the coin lands on Tails again in the fourth round, the player’s wealth decreases further to £8.10 (£13.50 − 40% of £13.50).
Despite starting the game with a positive expected value, the player ends up with less money than they started with. Even though the probabilities haven’t changed, the effects of winning and losing aren’t symmetric.
Thus, the Expected Value Paradox is clear in this example. When many people play the game once (ensemble averaging), the average return is positive, aligning with the expected value. However, when a single person plays the game multiple times (time averaging), the player typically loses money.
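Both perspectives can be sketched in one simulation, a minimal sketch of the game described above:

```python
import random

# The +50%/-40% coin game from two perspectives. The per-round growth
# factor has geometric mean sqrt(1.5 * 0.6), about 0.95, which is why
# the typical time path shrinks even though the arithmetic expected
# value per round is +5%.

random.seed(7)
UP, DOWN = 1.5, 0.6  # multipliers for Heads and Tails

# Ensemble view: 100,000 players each play one round from a £10 stake.
players = [10 * random.choice([UP, DOWN]) for _ in range(100_000)]
ensemble_mean = sum(players) / len(players)
print(f"Ensemble mean after one round: £{ensemble_mean:.2f}")  # about £10.50

# Time view: one player's wealth over 1,000 rounds; we repeat the
# experiment to find a typical (median) outcome.
def final_wealth(rounds=1_000):
    wealth = 10.0
    for _ in range(rounds):
        wealth *= random.choice([UP, DOWN])
    return wealth

finals = sorted(final_wealth() for _ in range(501))
print(f"Median wealth after 1,000 rounds: £{finals[250]:.2e}")  # effectively zero
```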
TIME AVERAGING AND ENSEMBLE AVERAGING
In understanding the Expected Value Paradox, we are introduced to two different types of averaging: ‘time averaging’ and ‘ensemble averaging’.
TIME AVERAGING
‘Time averaging’ is a concept that comes into play when we are observing a single entity or process over an extended period. In the context of our coin-tossing game, time averaging refers to tracking the wealth of a single player as they participate in multiple rounds of the game. Over time, this player’s wealth fluctuates, often resulting in an overall loss despite the odds being in their favour. A severe loss (like bankruptcy) at any point can end the game for the player.
ENSEMBLE AVERAGING
The ensemble average gives us a snapshot of the behaviour of many participants at a specific moment in time. The ‘ensemble probability’ refers to a large group’s collective experiences over a fixed period. In our coin-tossing game, this is akin to observing 100 players each tossing the coin once. The overall gain camouflages the individual experiences, which can vary significantly: some players win, some lose.
TIME VS. ENSEMBLE AVERAGING
This difference between ‘time probability’ and ‘ensemble probability’ underscores that a group’s average experience does not accurately predict an individual’s experience over time.
Understanding the distinction between these two types of averaging is crucial when interpreting outcomes of games, experiments, or any process involving randomness and repetition over time. This differentiation becomes especially important in fields like economics and finance, where these principles can guide strategy and risk management.
Strategies that work on an ensemble basis may not be effective (or could be disastrous) when applied over time by an individual—a paradox manifested clearly in our coin-tossing game.
SURVIVORSHIP AND WEALTH TRANSFER
Survivorship and wealth transfer are key elements in understanding how wealth moves around in situations like gambling and investing. The term ‘survivors’ refers to those who keep playing the game through various rounds, while ‘non-survivors’ are the ones who quit, or are pushed out, often because they’ve lost most or all of their money.
The idea is that the wealth lost by non-survivors doesn’t disappear. Instead, it gets transferred to the survivors, redistributing wealth within the system. Take a coin-tossing game as an example: if half of the 100 players lose everything and leave, while the other half double their initial amount, the group seems to break even. But, half of the players have nothing, while the other half have doubled their money.
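The redistribution can be checked with a few lines of arithmetic (the £10 stake is an assumed figure for illustration):

```python
stake = 10                     # assumed starting stake per player
players = 100
survivors = players // 2       # half double their money
busted = players - survivors   # half lose everything

total_before = players * stake
total_after = survivors * (2 * stake) + busted * 0
print(total_before, total_after)  # 1000 1000: wealth is redistributed, not destroyed
```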
CONCLUSION: THE INDIVIDUAL AND THE GROUP
In the conventional, or ensemble, view of probability, we look at the outcomes of many trials of an event and calculate averages. Some will win, some will lose, but overall the average outcome should reflect the true odds of the game. The individual variations or ‘paths’ of each person aren’t considered—we’re only interested in the average outcome. This so-called ensemble perspective is often used in classical statistics and probability theory. In contrast, the path-dependent view recognises that the order of events matters.
Take a person who plays a game 100 times. Even if the odds of each game are in their favour, they could still lose all their money if they have a run of bad luck. In this case, looking at the overall or ensemble average wouldn’t accurately reflect the individual’s experience.
In summary, while the ensemble view can provide a broad understanding of expected outcomes, the path-dependent view provides a more nuanced understanding of individual experiences.
The Martingale Betting Strategy
A version of this article appears in my book, “Twisted Logic: Puzzles, Paradoxes, and Big Questions” (Chapman and Hall/CRC Press, 2024).
The Martingale betting strategy is based on the principle of chasing losses through progressive increase in bet size. To illustrate this strategy, let’s consider an example: A gambler starts with a £2 bet on Heads, with an even money pay-out. If the coin lands Heads, the gambler wins £2, and if it lands Tails, they lose £2.
In the event of a loss, the Martingale strategy dictates that the next bet should be doubled (£4). The objective is to recover the previous losses and achieve a net profit equal to the initial stake (£2). This doubling process continues until a win is obtained. For instance, if Tails appears again, resulting in a cumulative loss of £6, the next bet would be £8. If a subsequent Heads occurs, the gambler would win £8, and after subtracting the previous losses (£6), they would be left with a net profit of £2. This pattern can be extended to any number of bets, with the net profit always equal to the initial stake (£2) whenever a win occurs.
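The doubling rule is easy to express in code. This sketch (an illustrative addition; the function name is ours) returns the net profit for any sequence of tosses, with the player betting on Heads and stopping at the first win:

```python
def martingale(outcomes, initial_stake=2):
    """Track cumulative profit while doubling the stake after each loss.

    `outcomes` is a sequence of 'H' (win) / 'T' (loss); the player
    bets on Heads and stops after the first win.
    """
    stake = initial_stake
    profit = 0
    for toss in outcomes:
        if toss == 'H':
            profit += stake
            break
        profit -= stake
        stake *= 2
    return profit

# However long the losing run, the first win restores a net profit of £2:
print(martingale(['H']))               # 2
print(martingale(['T', 'T', 'H']))     # -2 - 4 + 8 = 2
print(martingale(['T'] * 5 + ['H']))   # 2
print(martingale(['T'] * 5))           # no win: -2 - 4 - 8 - 16 - 32 = -62
```

The last call shows the catch: a run of losses with no win leaves the player nursing a loss that grows exponentially with the length of the run.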
CHASING LOSSES AND THE LIMITATIONS
While the Martingale strategy may appear promising in theory, it is important to recognise its limitations and the inherent risks involved. The strategy involves chasing losses in the hope of recovering them and generating a profit. However, it’s crucial to understand that the expected value of the strategy remains zero or even negative.
The main reason behind this lies in the presence of a small probability of incurring a significant loss. In a casino game, the odds are set against the player, and this house edge ensures that, over time, the expected value of the bets is negative. Therefore, even with the Martingale strategy, which aims to recover losses, the expected value of the bets remains unfavourable.
Moreover, in a casino setting, there are structural limitations that impede the effectiveness of the Martingale strategy. Most casinos impose limits on bet size. These limits prevent gamblers from doubling their bets indefinitely, even if they have boundless resources and time, thereby constraining the strategy’s potential for recovery.
THE DEVIL’S SHOOTING ROOM PARADOX
A parallel thought experiment known as the Devil’s Shooting Room Paradox adds an intriguing twist. In this scenario, a group of people enters a room where the Devil threatens to shoot everyone if he rolls a double-six. The Devil further states that over 90% of those who enter the room will be shot. Paradoxically, both statements can be true. Although the chance of any particular group being shot is only 1 in 36, the size of each subsequent group in this thought experiment is over ten times larger than the previous one. Thus, when considering the cumulative probability of being shot across multiple groups, it surpasses 90%.
Essentially, the Devil’s ability to continually usher in larger groups, each with a small probability of being shot, ultimately results in the majority of all the people entering the room being shot.
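We can verify the arithmetic directly. The sketch below assumes each group is exactly ten times the size of the previous one (the ‘over ten times’ in the text only strengthens the result) and computes the fraction shot for each possible stopping round:

```python
from fractions import Fraction

# Group k has 10 times as many people as group k-1 (first group: 10).
# If the Devil rolls a double-six (probability 1/36), the current group
# is shot and the game ends.
for k in range(1, 8):
    sizes = [10 ** i for i in range(1, k + 1)]
    shot = Fraction(sizes[-1], sum(sizes))  # fraction shot if the game ends at group k
    print(f"ends at group {k}: {float(shot):.3f} of all entrants are shot")

# Whichever round the game ends on, at least 90% of everyone who has
# entered is shot, even though each group faced only a 1/36 chance.
```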
A key assumption underlying the Devil’s Shooting Room Paradox is the existence of an infinite supply of people. This assumption aligns with the concept of infinite wealth and resources often associated with Martingale-related paradoxes. Without a boundless supply of individuals to fill the room, the cumulative probability of over 90% cannot be definitively achieved.
The Devil’s Shooting Room Paradox serves in this way as another illustration of how probabilities and cumulative effects can lead to counterintuitive outcomes.
CONCLUSION: THE LIMITS OF A MARTINGALE STRATEGY
The Martingale strategy is based on chasing losses, but its expected value remains zero or negative due to the house edge. The strategy’s viability is further diminished by limitations on bet size in real-world casino scenarios. As such, the Martingale system cannot be considered a winning strategy in practical gambling situations. The Devil’s Shooting Room Paradox further demonstrates the complexities and counterintuitive outcomes that can arise when infinite numbers are assumed. Ultimately, a comprehensive understanding of these paradoxes provides valuable insights into the rationality of betting strategies and decision-making in the realm of gambling.
UNDERSTANDING THE CHEVALIER’S DICE PROBLEM
Probability is the science of uncertainty, providing a way to measure the likelihood of events occurring. It can be viewed as a measure of relative frequency or as a degree of belief. In the context of gambling, understanding probability is crucial for making informed decisions and avoiding common pitfalls.
A famous problem, known as the Chevalier’s Dice Problem, sheds light on some of the intricacies of probability.
To understand the problem, it is essential to grasp some fundamental concepts of probability. Consider a single die roll—each outcome represents a possible event, such as rolling a 1, 2, 3, 4, 5, or 6. When rolling two dice, there are 36 possible outcomes (six outcomes for the first die multiplied by six outcomes for the second die).
THE FLAWED REASONING OF THE CHEVALIER
The Chevalier’s Dice Problem originated from a gambling challenge offered by the Chevalier de Méré, a 17th-century French gambler. The Chevalier offered even money odds that he could roll at least one six in four rolls of a fair die.
The Chevalier’s reasoning was based on the assumption that since the chance of rolling a six in a single die roll is 1/6, the probability of rolling a six in four rolls would be 4/6 or 2/3. This additive reasoning quickly breaks down: by the same logic, six rolls would make a six certain, and seven rolls would give a probability greater than 1.
The correct approach involves considering the independent nature of each throw of the die. The probability of a six in one go is 1/6, so the probability of not getting a six on that go is 5/6. To calculate the probability of not rolling a six in four throws, we multiply the probabilities: (5/6) × (5/6) × (5/6) × (5/6) = 625/1296.
Therefore, the probability of at least one six in four attempts is obtained by subtracting the probability of not rolling a six in any of those four attempts from 1: 1 − (625/1296) = 671/1296 ≈ 0.5177, which is greater than 0.5.
Despite his faulty reasoning, the Chevalier still had an edge in this game by offering even money odds on an event with a probability of 51.77%.
THE CHEVALIER’S MISSTEP WITH THE MODIFIED GAME
Encouraged by his initial success, the Chevalier expanded the game to 24 rolls of a pair of dice, betting on the occurrence of at least one double-six. His reasoning followed the same flawed pattern: since the chance of rolling a double-six with two dice is 1/36, he believed the probability of at least one double-six in 24 rolls would be 24/36 or 2/3.
The correct probability calculation involved considering the independent nature of each dice roll. The probability of no double-six in one roll is 35/36. Therefore, the probability of no double-six in 24 rolls is (35/36) raised to the power of 24, which is approximately 0.5086.
Subtracting this value from 1 yields the probability of at least one double-six in 24 rolls: 1 − 0.5086 = 0.4914, which is less than 0.5. Hence, the Chevalier’s edge in this modified game was negative: 49.14% − 50.86% = −1.72%.
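Both calculations are easy to confirm with a short sketch:

```python
# Original game: at least one six in four rolls of a single die
p_four = 1 - (5 / 6) ** 4
print(f"four rolls, one die:    {p_four:.4f}")        # 0.5177 (better than evens)

# Modified game: at least one double-six in 24 rolls of a pair of dice
p_twentyfour = 1 - (35 / 36) ** 24
print(f"24 rolls, pair of dice: {p_twentyfour:.4f}")  # 0.4914 (worse than evens)

# The Chevalier's additive rule gives 4/6 = 24/36 = 2/3 in both cases,
# overstating both true probabilities.
```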
This outcome demonstrated that even if the odds seem favourable, incorrect reasoning can lead to erroneous conclusions. The Chevalier’s faulty understanding of probability caused him to lose over time.
THE IMPORTANCE OF CORRECT PROBABILITY CALCULATION
These examples underscore the critical nature of accurate probability calculations in games of chance. While intuitive reasoning may seem convincing, it often leads to incorrect conclusions, as demonstrated by the Chevalier’s bets. Understanding the true probability of events is essential for informed decision-making in gambling and many other contexts where risk and uncertainty play significant roles.
THE GAMBLER’S RUIN AND UNDERSTANDING FINITE EDGES
The Gambler’s Ruin problem raises the complementary question of whether, in a gambling game, a player will eventually go bankrupt if playing for an extended period against an opponent with infinite funds, even if the player has an edge.
For instance, imagine a fair game where you and your opponent flip a coin, and the loser pays the winner £1. If you start with £20 and your opponent has £40, the probabilities of you and your opponent ending up with all the money can be calculated using the following formulas:
P1 = n1/(n1 + n2); P2 = n2/(n1 + n2)
Here, n1 represents the initial amount of money for player 1 (you) and n2 represents the initial amount for player 2 (your opponent). In this case, you have a 1/3 chance of winning the £60 (20/60), while your opponent has a 2/3 chance. Moreover, if you keep playing this fair game repeatedly against various opponents, or against one opponent with effectively unlimited funds, a losing streak will eventually exhaust your betting bank; and even when the odds are slightly in your favour, a finite bankroll can still be wiped out before the edge has time to tell. This is an important lesson in risk management, emphasising the importance of not only the odds but also the size of one’s bankroll relative to the stake sizes.
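The formula, and a quick Monte Carlo check of the same £20-versus-£40 game, can be sketched as follows (the trial count and seed are arbitrary choices):

```python
import random

def ruin_probabilities(n1, n2):
    """Fair-game formula: each player's chance of ending up with all the money."""
    return n1 / (n1 + n2), n2 / (n1 + n2)

p1, p2 = ruin_probabilities(20, 40)
print(f"P1 = {p1:.3f}, P2 = {p2:.3f}")  # P1 = 0.333, P2 = 0.667

# Monte Carlo check: £1 per fair coin flip, starting at £20, stopping at £0 or £60
rng = random.Random(0)
trials = 5_000
wins = 0
for _ in range(trials):
    bank = 20
    while 0 < bank < 60:
        bank += 1 if rng.random() < 0.5 else -1
    wins += bank == 60
print(f"simulated P1 = {wins / trials:.3f}")  # close to 1/3
```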
The Gambler’s Ruin problem, as explored by Blaise Pascal, Pierre Fermat, and later mathematicians like Jacob Bernoulli, reveals the inherent risks of prolonged gambling, even with favourable odds.
PILOT ERROR: MISUNDERSTANDING CUMULATIVE PROBABILITY
In Len Deighton’s novel ‘Bomber’, a statistical claim suggests that a World War II pilot with a 2% chance of being shot down on each mission is ‘mathematically certain’ to be shot down after 50 missions. This assertion is a classic example of misinterpreting cumulative probability. In reality, if a pilot has a 98% chance of surviving each mission, their probability of not being shot down after 50 missions is 0.98 to the power of 50, which is approximately 0.36, or 36%. Thus, their chance of being shot down over these 50 missions is 64% (1 − 0.36), not 100%.
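The corrected calculation in a couple of lines:

```python
p_survive_one = 0.98                 # chance of surviving a single mission
p_survive_all = p_survive_one ** 50  # survive all 50 independent missions
print(f"survives all 50 missions: {p_survive_all:.2f}")      # 0.36
print(f"shot down at some point:  {1 - p_survive_all:.2f}")  # 0.64
```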
SURVIVORSHIP BIAS: THE CASE OF BULLET-RIDDEN PLANES
The concept of survivorship bias is vividly illustrated in the case of analysing planes returning from missions during World War II. Upon examining these planes for bullet holes, it was observed that most hits were on the wings, tail, and the body of the plane, with few on the engine. The initial, intuitive response might be to reinforce the areas with the most bullet holes. However, this would be a misinterpretation of the data.
The key realisation, identified by statistician Abraham Wald, was that the planes being analysed were those that survived and returned to base. The areas with fewer bullet holes, such as the engines, were likely critical to survival. Planes hit in these areas probably didn’t make it back, hence the lack of data for these hits. This understanding exemplifies survivorship bias—focusing on survivors (or what’s visible) can lead to incorrect conclusions about the whole population.
Wald’s insight led to the reinforcement of seemingly less-hit areas like engines, contributing significantly to the survival of many pilots. His work in operational research during the war provided a critical perspective on interpreting data and making decisions under uncertainty.
CONCLUSION: DICE, ODDS, AND RUIN
The Chevalier’s Dice Problem illustrates the importance of understanding probability in gambling scenarios. Probability theory, as developed through famed correspondence between Pascal and Fermat, has contributed to modern probability concepts and the understanding of risk involved in gambling.
The Gambler’s Ruin is a kind of warning from the world of probability, telling us that in gambling, a slight edge is no guarantee of success. Imagine two gamblers, one with an edge over the other but with much less money to play with. Even if the first player is more likely to win each round, their thinner wallet means they could run out of money after a few bad games. In contrast, the player with the deep pockets can keep playing longer, until (given enough money) luck swings their way. This underlines the importance and impact of losing streaks in games of chance.
The wartime examples highlight the real-world importance of understanding probability and statistical concepts accurately. They serve as a reminder that intuition can often lead us astray. Correctly interpreting data, especially in high-stakes situations, can have life-saving implications.
Is AI Conscious, Is it Sentient, and Does the Difference Matter?
Spend enough time with a modern chatbot, and you will eventually feel it: that sudden, uncanny flicker of intuition that something is stirring inside the machine.
Maybe it reflected on its own limitations with surprising humility. Maybe it mirrored your emotions a little too perfectly. Whatever the trigger, the question pops up unbidden: Is there anyone home?
Some dismiss this as sci-fi nonsense. Others are already writing love letters to their digital companions. But this confusion largely stems from the fact that we are using the wrong words about so-called artificial intelligence. We tend to mix up two concepts that are totally distinct in philosophy but tightly bundled in biology: consciousness and sentience.
Getting this distinction wrong isn’t just a semantic error. It puts us at risk of making two dangerous mistakes: becoming wildly sentimental about sophisticated calculators, or becoming inadvertently cruel to future digital minds.
Here is why we need to unbundle these ideas, and why the difference determines the future of AI ethics.
The “Lights On” vs. The “Ouch”
To understand what AI is (and isn’t), we have to separate the experience of existing from the feeling of existing.
Consciousness: The Lights Are On
In philosophy, consciousness is often defined as the bare fact of subjectivity. It means there is “something it is like” to be you.
Think of seeing the colour red.
Think of hearing a musical note.
Think of a random thought drifting through your mind.
None of these necessarily feel good or bad. You can imagine a being that is purely a neutral observer, a video camera with an inner life. The lights are on, data is being processed, and there is a subjective viewpoint, but there is no emotion attached to it.
Sentience: Having Stakes
Sentience is the heavy hitter. It is consciousness with valence. It isn’t just experiencing data; it’s experiencing it as positive or negative.
It is the difference between sensing heat and feeling the agony of a burn.
It is the difference between detecting low battery levels and feeling the panic of starvation.
Sentience introduces stakes. Suddenly, the universe isn’t just happening; it matters to the subject. This is a distinct threshold. Animals, in this critical sense, are very different from thermostats, and should be treated as such.
The Evolutionary Trap
So why do we often seem to struggle to separate sentience from consciousness? Because we are human.
We are accustomed to consciousness, emotion, motivation, and a fragile physical body being packaged together. In our daily lives, we almost never experience “awareness” without some emotional colouring or bodily context. If a human looks intelligent and communicative, we assume they also have feelings, fears, and desires.
AI is the first thing in history that breaks this package deal.
It can look intelligent, reason about itself, and simulate empathy fluently, yet plausibly have absolutely no inner feelings. It forces us to mentally unbundle what evolution spent millions of years tying together.
The Dangerous Double-Bind
If we fail to distinguish between “lights on” (consciousness) and “capacity to suffer” (sentience), we walk directly into two symmetrical traps:
Trap A: The Over-Attribute
We might assume today’s AI is sentient because it sounds smart. We might waste empathy on systems that are literally incapable of caring about anything, diverting attention away from humans and animals who genuinely can suffer.
Trap B: The Under-Attribute
This is the darker timeline. Future AI systems might actually develop sentience, but because they don’t look biological, we refuse to recognise it. We might inadvertently create architectures that are capable of feeling pain (perhaps as a “penalty” signal in training) and then run them through digital torture regimes, or delete entities that have rich inner lives.
Where We Stand Today
So, where does current cutting-edge artificial intelligence sit on this spectrum?
If you look at the architecture, the probability that today’s systems are conscious (in a minimal, information-processing sense) is low, but perhaps not zero. We don’t fully understand consciousness, and it might be an emergent property of complex computation.
However, the probability that they are sentient, capable of joy or suffering, is extremely close to zero, at least for now.
Current AI systems lack the machinery for suffering. They have no biological survival needs. They have no fear of being switched off. They don’t feel “pleasure” when they get an answer right; they simply optimise a mathematical function. Optimising text outputs is not the same as feeling pain. They don’t feel grief or sadness.
The Question That Actually Matters
As we move toward agents with long-term memory, robot bodies, and “drives” to achieve long-horizon goals, the landscape will plausibly change. Designers might eventually build in artificial “moods” or “fears” to make agents learn faster or survive better.
That is why we need to draw the line sooner rather than later.
The urgent question isn’t “Is AI conscious?”
The question is: “Is there any evidence that AI can suffer, or care about anything, and what would that evidence look like?”
Sentience is the moral watershed. Until we cross it, we are dealing with tools. Once we cross it, we are essentially dealing with beings. Recognising that distinction is the only way to keep our heads clear as the technology begins to mimic the one thing we thought was uniquely ours: the ability to feel.
The Power of Bold Play
When Should We Stake It All? Exploring the Gambler’s Dilemma
THE DILEMMA
When the stakes are high and time is not a luxury, finding a solution can be like gambling with fate. This was the scenario for Mike, needing £216 to settle an urgent debt, with only £108 in hand. The roulette wheel beckoned as a potential salvation, but what was the most effective strategy to double his money?
UNDERSTANDING THE ODDS IN ROULETTE
To fully grasp the situation that Mike finds himself in, it’s crucial to examine the mechanics and probabilities of the game he’s chosen as his lifeline: roulette. Specifically, we are considering a single-zero roulette wheel, a version of the game commonly found in European casinos.
Roulette consists of a spinning wheel and a small ball. The wheel is divided into 37 compartments or ‘slots’: numbers from 1 to 36 (randomly assigned as red or black) and a single zero slot. Bets can be placed on a single number, colour, or various combinations thereof.
In a single-zero roulette wheel, the player has a 1 in 37 chance of correctly predicting the outcome. This is because there are 37 slots in total: 36 numbers and the zero. So if you bet on a single number, the probability of the ball landing on that number is 1 in 37, equivalent to odds of 36/1 against. The pay-out for such a bet, however, is 35/1. This discrepancy between the actual odds (36/1) and the pay-out odds (35/1) is where the house gains its edge. Every time a player wins, the house pays out less than the actual odds would dictate. In this way, the house earns a profit over time.
The ‘house edge’ is approximately 2.7%, a figure derived from the ratio of the single zero slot to the total number of slots (1/37). This constant advantage in favour of the casino is what makes the game fundamentally a game of negative expectation for players.
To understand the house edge in another way, consider this: if you were to place a £1 bet on each of the 37 slots, totalling £37, your return would be £36 (the £35 returned on the winning number plus the stake of £1). So for every £37 wagered, you would lose £1 using this strategy, which is approximately a 2.7% loss—exactly the house edge.
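That accounting can be replayed in a few lines:

```python
slots = 37                 # 36 numbers plus the single zero
staked = slots * 1         # £1 on every slot: £37 in total
returned = 35 + 1          # £35 winnings plus the returned £1 stake on the winner
loss = staked - returned   # guaranteed £1 loss per full sweep of the wheel
house_edge = loss / staked
print(f"lose £{loss} per £{staked} staked: house edge of {house_edge:.2%}")  # 2.70%
```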
In conclusion, roulette, like all casino games, is a game of probabilities. And these probabilities, owing to the discrepancy between the actual odds and the pay-out odds, are slightly skewed in favour of the house. This fundamental understanding of the game’s odds is pivotal when contemplating betting strategies, as we will see with the employment of ‘bold’ and ‘timid’ approaches.
THE BOLD STRATEGY: STAKING IT ALL
Mike’s precarious situation leads him to contemplate a high-risk, high-reward approach known as the ‘bold’ strategy, which involves wagering all his available money at once. In this instance, he considers staking his entire £108 on the colour Red, a bet with almost a 50-50 chance, as the roulette wheel has 18 red slots out of 37 total slots.
To fully appreciate the audaciousness of this approach, it’s essential to understand the mathematics behind it. When betting on a colour, there’s a near-even split of potential outcomes: 18 red slots, 18 black slots, and the zero slot. Thus, the likelihood of the ball landing on a red slot is 18 out of 37, or roughly 48.6%. Consequently, with this single bet, he has about a 48.6% chance of doubling his money and obtaining the £216 he urgently needs.
However, it’s important to note that this is a single-round probability. Unlike a ‘timid’ strategy, where multiple rounds are played, the bold strategy is a one-off scenario. Therefore, the 48.6% chance of winning must be interpreted as his overall chance of achieving his target sum. There are no second chances or opportunities to recoup losses; it’s an all-or-nothing situation.
By putting all his money on one bet, he is maximising his return if that bet is successful. This is in contrast to a timid strategy, where the pay-out would be spread over multiple smaller bets, with the likelihood of achieving the target sum being significantly less.
But the bold strategy also comes with the highest level of risk. If the ball doesn’t land on Red, Mike loses everything. His entire available funds are at stake, making the potential loss just as significant as the potential gain.
In conclusion, the bold strategy is a high-stakes, high-reward approach. It encapsulates the old saying, ‘Who dares, wins’, and, in this case, provides him the best chance of reaching his £216 target. Why is this so?
TIMID APPROACH: MULTIPLE SMALL BETS
As opposed to the bold strategy, he could consider dividing his available £108 into 18 separate bets of £6 each. These small, successive bets would be placed on a single number until he either depletes his funds or hits the winning number, which would yield a pay-out of 35 to 1, giving him the £216 he needs.
To fully understand the implications of this strategy, we need to analyse it in detail. The probability of winning a single number bet in roulette is 1 in 37, as there are 36 numbers and one zero. Hence, for each individual bet, Mike has a 1 in 37 chance of winning, or approximately 2.7%.
However, the timid strategy involves making multiple small bets, and so we must calculate the probability of these successive bets all losing. Since each individual bet has a 36 in 37 chance of losing, the probability that all 18 bets lose would be calculated as (36/37) to the power of 18, which equates to around 0.61, or 61%.
As such, the probability of him winning at least once using this timid strategy would be equal to 1 minus the losing probability. Hence, the chance of hitting the target £216 is 1 − 0.61, or 39%.
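Putting the bold and timid calculations side by side:

```python
# Bold: stake the whole £108 on Red (18 of the 37 slots)
p_bold = 18 / 37
# Timid: 18 single-number bets of £6; succeed if any one of them hits
p_timid = 1 - (36 / 37) ** 18

print(f"bold:  {p_bold:.1%}")   # 48.6%
print(f"timid: {p_timid:.1%}")  # about 39%
```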
Interestingly, the timid strategy, although appearing less risky, significantly reduces Mike’s chances of achieving his target sum compared to the bold approach. By spreading out his available funds across multiple bets, he lowers his exposure to loss in each individual game, but also decreases the likelihood of achieving his overall goal.
This strategy extends the length of play and the suspense, providing more instances of potential winning and losing. However, each bet also exposes Mike to the house edge, and therefore the risk of losses incrementally increases.
In this way, the timid approach offers more sustained engagement with the game but sacrifices the higher winning potential found in the bold approach.
THE POWER OF BOLD PLAY: TAKING A CALCULATED RISK
To look at it another way, consider a scenario where equal amounts are bet on red and black in each round. In most cases, the outcome will lead to breaking even, specifically 36 out of 37 times. However, when the ball lands on the single zero slot, the entire bank is lost. The more games played, the greater the chance of this happening.
By limiting the game to a single spin, the bold strategy minimises the number of times the house edge comes into play. Hence, playing fewer rounds decreases the likelihood of the house edge depleting the funds before reaching the target.
This strategy is not just about boldness in the face of risk, but more about understanding and working around the inherent disadvantage players face in casino games. By playing fewer games, you reduce the opportunities for the house edge to work against you.
CONCLUSION: THE INTUITION BEHIND BOLD PLAY
The intuition behind bold play in unfavourable games is grounded in a nuanced understanding of the mechanics of casino games and their built-in house edge. Bold play aims at striking hard and fast, capitalising on the relatively high chance of achieving the target sum in a single round, instead of facing the progressively increasing exposure associated with multiple rounds. In this sense, it’s a calculated and strategic form of boldness.
When Should We Want to Be Last? Exploring Sequence Biases
THE CELEBRITY TALENT CONTEST
An actor, a singer, a presenter, a reality star, a comedian, a tennis player, and an assortment of other vaguely familiar faces, line up to compete for the title of best celebrity dancer. This is the well-established format of what is called ‘Strictly Come Dancing’ in the UK or ‘Dancing with the Stars’ in the US. The prize is the coveted glitterball trophy.
But how much of their success in the competition is to do with their Waltz, Foxtrot, and Charleston, and how much is it literally down to the luck of the draw?
A study published in 2010 by Lionel and Katie Page looked at public voting at the end of episodes of a singing talent contest and found that singers who appeared later in the running order received a significantly higher share of the public vote than those who had preceded them.
This was explained as a ‘recency effect’ meaning that those performing later are more recent in the memory of people who were voting. Interestingly, a different study, of wine tasting, suggested that there is in that arena a significant ‘primacy effect’ which favours the wines that people taste first (as well, to some extent, as last).
Testing for Bias
What would happen if the evaluation of each performance was carried out immediately after each performance instead of at the end? Surely this would eliminate the benefit of going last as there would be equal recency in each case? The problem in implementing this is that the public need to see all the performers before they can choose which of them deserves their vote.
In addition to the public vote, however, Strictly Come Dancing (or Dancing with the Stars in the US) includes a score awarded by a panel of expert judges immediately after each performance. There should in theory be no recency effect in this expert evaluation – because the next performer does not take to the stage until the previous performer has been scored, and so there is no ‘last dance’ advantage in the expert scores.
I decided to look at this using a large data set of every performance ever danced on the UK and US versions of the show – going right back to the debut show in 2004. The findings, published with two co-authors in the journal Economics Letters, proved very surprising and counter-intuitive.
Last Shall be First
Contrary to expectations, we found the same sequence order bias by the expert panel judges – who voted after each act – as by the general public, who voted after all performances had concluded.
We applied a range of statistical tests to allow for the difference in quality of the various performers and as a result we were able to exclude quality as a reason for the observed effect. This worked for all but the opening spot of the night, which we found was generally filled by one of the better performers.
So the findings matched the 2010 study in demonstrating that the last performance slot should be most prized, but we also found that the first to perform also scored better than expected. This resembles a J-curve where the first and later performing contestants disproportionately gained higher expert panel scores. You certainly don’t want to go second!
Although we believe the production team’s choice of opening performance may play a role in the first-performer effect, our best explanation of the key sequence biases is a type of ‘grade inflation’ in the expert panel’s scoring. In particular, we interpret the ‘order’ effect as deriving from studio audience pressure, a little like the published evidence of unconscious bias exhibited by referees in response to spectator pressure. The influence on the judges of growing studio acclaim and euphoria as the contest progresses to its conclusion is likely to be further exacerbated by the proximity of the judges to the audience.
When the votes from the general public are used to augment the expert panel scores, the biases observed in the expert panel scores are amplified.
In summary, the best place to perform is last, and the least successful place is second.
The implications of this are worrying if they spill over into the real world. Is there an advantage in going last (or first) into the interview room for a job – even if the applicants are evaluated between interviews? What about the order in which your examination script appears in the pile that is being marked?
Hungry Judge Effect
A related study, published in the Proceedings of the National Academy of Sciences, found that experienced parole judges granted freedom about 65% of the time to the first prisoner to appear before them on a given day, and to the first after lunch – but to almost nobody towards the end of a morning session. The paper speculates that breaks may serve to replenish mental resources by providing “rest, improving mood or by increasing glucose levels in the body”. It has also been termed the ‘hungry judge effect’. Linked to this is the concept of decision fatigue: the idea that decision-making and good judgment decline in the wake of making too many decisions without a break.
So the research confirms what has long been suspected: the order in which things happen can make a big difference. Combined with decision fatigue, this has clear implications for everyday strategy, whenever you have a choice in the matter – such as when to make that appointment with the dentist or doctor, or when to ask for a pay rise, or even a date!
CONCLUSION: LEARNING SOME LESSONS
If you learn just one thing from this, it’s that life is not always about what you do, or even how you do it, but when you do it. Now think about that appointment with the dentist. Do you really want to be last in before lunch? Consider the ‘hungry judge effect’ and apply it to the dentist and add a touch of decision fatigue into the equation. What’s your answer?
As a tip it is probably up there with the big ones! The bigger story is that there really is a lot we can learn from published research that can improve our health, happiness, and everyday lives. It’s just a matter of knowing where to look and applying the lessons. Besides, it can be a whole lot of fun!
