
Quantum World Thought Experiments – Guide Notes.

Is it possible to be both alive and dead at the same time? This is the question central to the famous Schrödinger’s Cat thought experiment. In the version posed by Erwin Schrödinger, a cat is placed in an opaque box for an hour with a small piece of radioactive material which has an equal probability of decaying or not in that time period. If some radioactivity is detected by a Geiger counter also placed in the box, a relay releases a hammer which breaks a flask of hydrocyanic acid, killing the cat. If no radioactivity is detected, the cat lives. Before we open the box at the end of the hour, we put the chance that the material has decayed and the cat is dead at 50/50, the same as the chance that it is alive. Before we open the box, however, is the cat alive (and we don’t know it yet), dead (and we don’t know it yet) or both alive and dead (until we open the box and find out)?

Common sense would seem to indicate that it is either alive or dead, but we don’t know which until we open the box. Traditional quantum theory suggests otherwise. The cat is both alive, with a certain probability, and dead, with a certain probability, until we open the box and find out, when it has to become one or the other with a probability of 100 per cent. In quantum terminology, the cat is in a superposition (two states at the same time) of being alive and dead, which only collapses into one state (dead or alive) when the cat is observed. This might seem absurd when applied to a cat. After all, surely it was either alive or dead before we opened the box and found out. It was simply that we didn’t know which. That may be true when applied to cats. But when applied to the microscopic quantum world, such common sense goes out the window as a description of reality. For example, photons (the smallest discrete packets of light) can behave as both waves and particles, and can travel in both clockwise and anti-clockwise directions at the same time. Each state exists in the same moment. As soon as the photon is observed, however, it must settle on one unique state. In other words, the common sense that we can apply to cats we cannot apply to photons or other particles at the quantum level.

So what is going on? The traditional explanation as to why the same quantum particle can exist in different states simultaneously is known as the Copenhagen Interpretation. Developed principally by Niels Bohr and Werner Heisenberg in the 1920s, the Copenhagen Interpretation states that a quantum particle does not exist in any one state but in all possible states at the same time, with various probabilities. It is only when we observe it that it must in effect choose which of these states it exists as. At the sub-atomic level, then, particles seem to exist in a state of what is called ‘coherent superposition’, in which they can be two things at the same time, and only become one when they are forced to do so by the act of being observed. The total of all possible states is known as the ‘wave function.’ When the quantum particle is observed, the superposition ‘collapses’ and the particle is forced into one of the states that make up its wave function.

The problem with this explanation is that all these different states exist. By observing the object, it might be that it reduces down to one of these states, but what has happened to the others? Where have they disappeared to?

This question lies at the heart of the so-called ‘Quantum Suicide’ thought experiment.

It goes like this. A man (not a cat) sits down in front of a gun which is linked to a machine that measures the spin of a quantum particle (a quark). If it is measured as spinning clockwise, the gun will fire and kill the man. If it is measured as spinning anti-clockwise, it will not fire and the man will survive to undergo the same experiment again.

The question is: will the man survive, and for how long? This thought experiment, popularised by Max Tegmark, has been answered in different ways by quantum theorists depending on whether or not they adhere to the Copenhagen Interpretation. In that interpretation, the gun will go off with a certain probability, depending on which way the quark is spinning. Eventually, by the laws of chance, the man will be killed, probably sooner rather than later. A growing number of theorists believe something else, however. They see both states (the particle is spinning clockwise and spinning anti-clockwise) as equally real, so there are two real outcomes. In one world, the man dies and in the other he lives. The experiment repeats, and the same split occurs. In one world there will exist a man who survives an indefinite number of rounds. In the other worlds, he is dead.
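To put a rough number on the Copenhagen answer, here is a minimal Python sketch. It assumes that each measurement is independent and that the gun fires with probability one half on every round; neither figure is fixed by the thought experiment as described above, and the function name is purely illustrative.

```python
def survival_probability(rounds: int, p_fire: float = 0.5) -> float:
    """Chance of surviving `rounds` independent trials.

    p_fire is the assumed probability that the gun fires on any one round
    (0.5 here is an illustrative assumption, not a figure from the text).
    """
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    return (1.0 - p_fire) ** rounds


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        print(n, survival_probability(n))
```

On these assumptions the chance of still being alive after ten rounds is already below one in a thousand. On the Many Worlds reading, that same figure is better thought of as the share of branches in which the man is still sitting in the chair.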

The difference between these alternative approaches is critical. The Copenhagen approach is to propose that the simultaneously existing states (for example, the quark that is spinning both clockwise and anti-clockwise simultaneously) exist in one world, and collapse into one of these states when observed. Meanwhile, the other states mysteriously disappear. The other approach is to posit that these simultaneously existing states are real states, and that neither magically disappears. Instead, they branch off into different realities when observed. In one world, the particle is observed spinning clockwise (in the Quantum Suicide thought experiment, the man dies) and in the other world the particle is observed spinning the other way (and the man lives). Crucially, according to this interpretation both worlds are real. In other words, they are not notional states of one world but alternative realities. This is the so-called ‘Many Worlds Theory.’

Where is the burden of proof in trying to determine which interpretation of reality is correct? This depends on whether we take as the default position the one world that we can observe, or the full set of possible states represented in the mathematics of the wave function. Adherents to the Many Worlds position argue that the default is to go with what is described in the mathematics underpinning quantum theory: that the wave function represents all of reality. According to this argument, the minimal mathematical structure needed to make sense of quantum mechanics is the existence of many worlds which branch off, each of which contains an alternative reality. Moreover, these worlds are real. To say that our world, the one that we are observing, is the only real one, despite all the other possible worlds or measurement outcomes, has been likened to the old belief that the Earth was at the centre of the universe. There is no real justification, according to this interpretation, for saying that our branch of all possible states is the only real one, and that all other branches are non-existent or are ‘disappeared worlds.’ Put another way, the mathematics of quantum mechanics describes these different worlds. Nothing in the maths says that this world that we observe is more real than another world. So the burden of proof is on those who say it is. The viewpoint of the Copenhagen school is diametrically opposite. They argue that the hard evidence is of the world we are in, and the burden of proof is on those positing other worlds containing other branches of reality.

Which default position we choose to adopt will determine whether we are adherents of the Copenhagen or the ‘Many Worlds’ school.

For me personally, the logic of the argument points to the Many Worlds school. But to believe that they are right, and the Copenhagen school is wrong, seems kind of crazy, and totally counter-intuitive. In another world, of course, I’m probably saying the exact opposite.

Exercise

Consider the main strength and weakness of the ‘Many Worlds’ interpretation of reality.

References and Links

Do Parallel Universes Really Exist? HowStuffWorks. https://science.howstuffworks.com/science-vs-myth/everyday-myths/parallel-universe.htm

How Quantum Suicide Works. HowStuffWorks. https://science.howstuffworks.com/innovation/science-questions/quantum-suicide.htm

 

The ‘Simulated World’ Problem – Guide Notes.

Do we live in a simulation, created by an advanced civilisation, in which we are part of some sophisticated virtual reality experience? For this to be a possibility we need to assume that sufficiently advanced civilisations will possess the requisite computing and programming power to create what philosopher Nick Bostrom termed ‘ancestor simulations’. These simulations would be complex enough for the minds that are simulated to be conscious and able to experience the type of experiences that we do. The creators of these simulations could exist at any stage in the development of the universe, even billions of years into the future.

The argument around simulation goes like this. One of the following three statements must be correct.

  1. That civilisations at our level of development always or almost always disappear before becoming technologically advanced enough to create these simulations.
  2. That the proportion of these technologically advanced civilisations that wish to create these simulations is zero or almost zero.
  3. That we are almost sure to be living in such a simulation.

To see this, let’s examine each proposition in turn.

  1. Suppose that the first is not true. In that case, a significant proportion of civilisations at our stage of technology go on to become technologically advanced enough to create these simulations.
  2. Suppose that the second is not true. In this case, a significant proportion of these civilisations run such simulations.
  3. If both of the above propositions are not true, then there will be countless simulated minds indistinguishable to all intents and purposes from ours, as there is potentially no limit to the number of simulations these civilisations could create. The number of such simulated minds would almost certainly be overwhelmingly greater than the number of minds that created them. Consequently, we would be quite safe in assuming that we are almost certainly inside a simulation created by some form of advanced civilisation.
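The counting step in this argument can be sketched in a few lines of Python. This is an illustration only: the function name, the parameter names and the example figures are all assumptions, and the sketch further assumes that each simulation contains about as many minds as one unsimulated history of a civilisation.

```python
def simulated_fraction(f_mature: float, f_interested: float,
                       sims_per_civ: float) -> float:
    """Share of all minds like ours that live inside a simulation.

    f_mature     -- assumed fraction of civilisations that reach the
                    stage where they could run such simulations
    f_interested -- assumed fraction of those that choose to run them
    sims_per_civ -- assumed average number of simulations each
                    interested civilisation runs
    """
    # Expected number of simulated histories per real history.
    simulated = f_mature * f_interested * sims_per_civ
    return simulated / (simulated + 1.0)


if __name__ == "__main__":
    # Hypothetical figures: 1% survive, 1% of those are interested,
    # and each interested civilisation runs a million simulations.
    print(simulated_fraction(0.01, 0.01, 1_000_000))
```

With these made-up figures there are about a hundred simulated histories for every real one, so the fraction comes out at roughly 0.99. The fraction falls towards zero only if one of the first two inputs is itself close to zero, which is exactly what the first two propositions assert.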

For the first proposition to be untrue, civilisations must be able to go through the phase of being able to wipe themselves out, either deliberately or by accident, carelessness or neglect, and in a significant proportion of cases avoid doing so. This might perhaps seem unlikely based on our experience of this world, but becomes more likely if we consider all other possible worlds.

For the second proposition to be true, we would have to assume that virtually all civilisations that were able to create these simulations would decide not to do so. This again is possible, but would seem unlikely.

If we consider both propositions, and we think it is unlikely that no civilisations survive long enough to achieve what Bostrom calls ‘technological maturity’, and that it is unlikely that hardly any would create ‘ancestor simulations’ if they could, then anyone considering the question is left with a stark conclusion. They are almost certainly living in a simulation.

To summarise. An advanced ‘technologically mature’ civilisation would have the capability of creating simulated minds. Based on this, at least one of three propositions must be true.

  1. The proportion of civilisations at our stage of development that go on to become technologically mature is zero or close to zero.
  2. The proportion of these advanced civilisations that wish to run these simulations is close to zero.
  3. The proportion of those consciously considering the question who are living in a simulation is close to one.

If the first of these propositions is true, we will almost certainly not survive to become ‘technologically mature.’ If the second proposition is true, virtually no advanced civilisations are interested in using their power to create such simulations. If the third proposition is true, then conscious beings considering the question are almost certainly living in a simulation.

Through the veil of our ignorance, it might seem sensible to assign equal credence to all three, and to conclude that unless we are currently living in a simulation, descendants of this civilisation will almost certainly never be in a position to run these simulations.

Strangely indeed, the probability that we are living in a simulation increases as we draw closer to the point at which we are able and willing to create such simulations ourselves. At the point that we would be ready to create our own simulations, we would paradoxically be at the very point when we were almost sure that we ourselves were simulations. Only by refraining from doing so could we in a certain sense make it less likely that we were simulated, as it would show that at least one civilisation that was able to create simulations refrained from doing so. Once we took the plunge, we would know that we were almost certainly only doing so as simulated beings. And yet there must have been someone or something that created the first simulation. Could that be us, we would be asking ourselves? In our simulated hearts and minds, we would already know the answer!

 

Exercise

With reference to Bostrom’s ‘simulation’ reasoning, generate an estimate as to the probability that we are living in a simulated world.

References and Links

The Simulation Argument. https://www.simulation-argument.com/

Do we live in a computer simulation? Nick Bostrom. New Scientist. 00Month 2006. 8-9. https://www.simulation-argument.com/computer.pdf

Are you living in a computer simulation? Bostrom, N. Philosophical Quarterly (2003). 53, 211. 243-255.


Bayes and the Beetle – in a nutshell.


An entomologist spots what might be a rare category of beetle, due to the pattern on its back. In the rare category, 98% have the pattern. In the common category, only 5% have the pattern. The rare category accounts for only 0.1% of the population. How likely is the beetle to be rare?

 

Since only 5 per cent of the common beetles bear the distinctive pattern and 98 per cent of the rare beetles do, intuition would tell you that you have come across a rare insect when you espy the pattern. Bayes’ Theorem tells you something quite different.

 

To calculate just how likely the beetle is to be rare given that we see the pattern on its back, we apply Bayes’ Theorem.

Posterior probability = ab/ [ab+c (1-a)]

a is the prior probability of the hypothesis (beetle is rare) being true. b is the probability that we observe the pattern if the beetle is rare (hypothesis is true). c is the probability that we observe the pattern if the beetle is not rare (hypothesis is false).

 

In this case, a = 0.001 (0.1%); b = 0.98 (98%); c = 0.05 (5%).

 

So, updated probability = ab/ [ab+c (1-a)] = (0.001 x 0.98) / [(0.001 x 0.98) + (0.05 x 0.999)] = 0.00098 / 0.05093 = 0.0192. So there is just a 1.92 per cent chance that the beetle is rare when the entomologist spots the distinctive pattern on its back.
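The same calculation can be checked with a short Python sketch. The function name is an illustrative choice; the arguments a, b and c follow the definitions given above.

```python
def posterior(a: float, b: float, c: float) -> float:
    """Posterior probability ab / [ab + c(1 - a)].

    a -- prior probability that the hypothesis is true
    b -- probability of the evidence if the hypothesis is true
    c -- probability of the evidence if the hypothesis is false
    """
    return a * b / (a * b + c * (1.0 - a))


if __name__ == "__main__":
    p = posterior(a=0.001, b=0.98, c=0.05)
    print(f"{p:.4f}")  # prints 0.0192
```

Changing the three inputs is enough to rework the problem for any other prior or pattern frequency.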

Why the counterintuitive result? Because so few of the population of all beetles are rare, i.e. the prior probability that the beetle is rare is almost vanishingly small, and it would take a lot more evidence than that acquired to make a reasonable case for the beetle being rare.

So what is the probability that the beetle is rare given that we observe the distinctive pattern? In other words, what is the probability that the hypothesis (the beetle is rare) is true given the evidence (the pattern)? That is 1.92 per cent. What is the probability that we will observe the distinctive pattern if the beetle is rare? In other words, what is the probability of observing the evidence (the pattern) if the hypothesis (the beetle is rare) is true? That is 98 per cent.

To conflate these, to believe these two concepts are the same, is to commit the classic Prosecutor’s Fallacy, i.e. to falsely equate the probability that the defendant is guilty given the observed evidence with the probability of observing the evidence given that the defendant is guilty. It’s a potentially very dangerous fallacy to commit, especially when you happen to be the defendant and the jury has never heard of the Reverend Thomas Bayes.
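One way to see why the two probabilities differ so much is to count beetles rather than multiply probabilities. The sketch below does this in Python for an imagined population of 100,000 beetles; the population size and the function name are assumptions chosen only to keep the counts whole.

```python
def beetle_counts(population: int = 100_000) -> dict:
    """Count beetles by category and pattern for a hypothetical population."""
    rare = population // 1000                  # 0.1% of beetles are rare
    common = population - rare
    rare_with_pattern = rare * 98 // 100       # 98% of rare beetles
    common_with_pattern = common * 5 // 100    # 5% of common beetles
    patterned = rare_with_pattern + common_with_pattern
    return {
        "rare_with_pattern": rare_with_pattern,
        "common_with_pattern": common_with_pattern,
        # Probability the beetle is rare, given the pattern.
        "p_rare_given_pattern": rare_with_pattern / patterned,
        # Probability of the pattern, given the beetle is rare.
        "p_pattern_given_rare": rare_with_pattern / rare,
    }


if __name__ == "__main__":
    for name, value in beetle_counts().items():
        print(name, value)
```

Of the 5,093 patterned beetles in this imagined population, only 98 are rare; the other 4,995 are common beetles that happen to carry the pattern. That is where the 1.92 per cent comes from, while 98 out of the 100 rare beetles carrying the pattern gives the 98 per cent.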

 

Appendix

We can also solve the Beetle problem using the traditional notation version of Bayes’ Theorem.

P(H|E) = P(E|H) x P(H) / [P(E|H) x P(H) + P(E|H’) x P(H’)]

In this case, P(H) = 0.001 (0.1%); P(E|H) = 0.98 (98%); P(E|H’) = 0.05 (5%).

So, P(H|E) = 0.98 x 0.001 / [(0.98 x 0.001) + (0.05 x 0.999)] = 0.00098 / (0.00098 + 0.04995) = 0.00098 / 0.05093 = 0.0192. So there is just a 1.92 per cent chance that the beetle is rare when the entomologist spots the distinctive pattern on its back.

Note also that P(H|E) = 0.0192, while P(E|H) = 0.98.

The Prosecutor’s Fallacy is to conflate these two expressions.

 

Exercise

An entomologist spots what might be a rare category of beetle, due to the pattern on its back. In the rare category, 95% have the pattern. In the common category, only 2% have the pattern. The rare category accounts for only 0.5% of the population. How likely is the beetle to be rare?

 

References and Links

CS201 – Bayes’ Theorem – Excerpts from Wikipedia


Jeff Thompson. Bayes’ Theorem. November 20, 2011. https://www.jeffreythompson.org/blog/2011/11/20/bayes-theorem/

The Strange Case of Sunrise, Sunset and the Shortest Day of the Year.

December 21st, 2018 is the shortest day of the year, at least in the UK, located in the Northern hemisphere of our planet.

So does that mean that the mornings should start to get lighter after today (earlier sunrise), as well as the evenings (later sunset)? Not so, and there’s a simple reason for that. The length of a solar day, i.e. the period of time between the solar noon (the time when the sun is at its highest elevation in the sky) on one day and the next, is not 24 hours in December, but about 30 seconds longer than that.

For this reason, each solar day in December runs about 30 seconds over 24 hours, and the discrepancy accumulates. By the end of the month, solar noon falls roughly 15 minutes later by a standard 24-hour clock than it did at the start.

Let’s say just for a moment that the hours of sunlight (the time difference between sunrise and sunset) stayed constant through December. Because the solar day is 30 seconds longer than 24 hours, a sunset that a 24-hour clock timed at 3.50pm one day would not arrive the next day till 3.50pm and 30 seconds. After ten days the sun would not set till 3.55pm according to the 24-hour clock. So the sunset would actually get later through all of December. For the same reason, the sunrise would get later through the whole of December.
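The arithmetic of this simplified case can be sketched in Python. It keeps the same assumptions as the paragraph above (constant hours of sunlight and a flat 30 seconds of excess per solar day); the starting date of 1 December 2018 and the function name are hypothetical, chosen purely for illustration.

```python
from datetime import datetime, timedelta


def sunset_by_clock(first_sunset: datetime, days: int,
                    excess_seconds: float = 30.0) -> datetime:
    """Clock time of sunset `days` later, if daylight hours stayed constant.

    excess_seconds is the assumed amount by which each solar day
    exceeds 24 hours.
    """
    return first_sunset + timedelta(days=days, seconds=days * excess_seconds)


if __name__ == "__main__":
    start = datetime(2018, 12, 1, 15, 50)   # hypothetical 3.50pm sunset
    print(sunset_by_clock(start, 10))        # prints 2018-12-11 15:55:00
```

Running the same function over 30 days gives a drift of 15 minutes, which matches the end-of-month figure quoted earlier.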

In fact, the sunset doesn’t get progressively later through all of December, because the hours of sunlight shorten for about the first three weeks. Taken on its own, this shortening makes the sun set earlier and rise later.

These two things (the shortening hours of sunlight and the extended solar day) pull the sunset in opposite directions before the shortest day, and pull the sunrise in opposite directions after it. The overall effect is that the sun starts to set later from a week or so before the shortest day, but doesn’t start to rise earlier till about a week or so after the shortest day.

So the old adage that the evenings will start to draw out after the end of the third week of December or so, and the mornings will get lighter, is false. The evenings have already been drawing out for several days before the shortest day, and the mornings will continue to grow darker for several days more.

There’s one other curious thing. The solar noon coincides with noon on our 24-hour clocks just four times a year. One of those days is Christmas Day! So set your clock to noon on December 25th, look up to the sky and you will see the sun at its highest point. Just perfect!

 

Links

http://www.timeanddate.com/astronomy/uk/nottingham

http://www.bbc.co.uk/news/magazine-30549149

http://www.rmg.co.uk/explore/astronomy-and-time/time-facts/the-equation-of-time

http://en.wikipedia.org/wiki/Solar_time

http://earthsky.org/earth/everything-you-need-to-know-december-solstice

The US mid-term elections: a triumph for political forecasting.

The results of the US midterm elections are now largely in and they came as a shock to many seasoned forecasters.

This wasn’t the kind of shock that occurred in 2016, when the EU referendum tipped to Brexit and the US presidential election to Donald Trump. Nor the type that followed the 2015 and 2017 UK general elections, which produced a widely unexpected Conservative majority and a hung parliament respectively.

On those occasions, the polls, pundits and prediction markets got it, for the most part, very wrong, and confidence in political forecasting took a major hit. The shock on this occasion was of a different sort – surprise related to just how right most of the forecasts were.

Take the FiveThirtyEight political forecasting methodology, most closely associated with Nate Silver, famed for the success of his 2008 and 2012 US presidential election forecasts.

In 2016, even that trusted methodology failed to predict Trump’s narrow triumph in some of the key swing states. This was reflected widely across other forecasting methodologies, too, causing a crisis of confidence in political forecasting. And things only got worse when much academic modelling of the 2017 UK general election was even further off target than it had been in 2015.

How did it go so right?

So what happened in the 2018 US midterm elections? This time, the FiveThirtyEight “Lite” forecast, based solely on local and national polls weighted by past performance, predicted that the Democrats would pick up a net 38 seats in the House of Representatives. The “Classic” forecast, which also includes fundraising, past voting and historical trends, predicted that they would pick up a net 39 seats. They needed 23 to take control.




With almost all results now declared, those forecasts look very close to spot on: the projected final tally is a net gain of 40 seats for the Democrats. Meanwhile, the Republicans were forecast to hold the Senate by 52 seats to 48. The final count is likely to be 53-47. There is also an argument that the small error in the Senate forecast can be accounted for by poor ballot design in Florida, which disadvantaged the Democrat in a very close race.

Some analysts currently advocate looking at the turnout of “early voters”, broken down by party affiliation, who cast their ballot before polling day. They argue this can be used as an alternative or supplementary forecasting methodology. This year, a prominent advocate of this methodology went with the Republican Senate candidate in Arizona, while FiveThirtyEight chose the Democrat. The Democrat won. Despite this, the jury is still out over whether “early vote” analysis can add any value.

There has also been research into the forecasting efficiency of betting/prediction markets compared to polls. This tends to show that the markets have the edge over polls in key respects, although they can themselves be influenced by and overreact to new poll results.

There are a number of theories to explain what went wrong with much of the forecasting prior to the Trump and Brexit votes. But looking at the bigger picture, which stretches back to the US presidential election of 1868 (in which Republican Ulysses S Grant defeated Democrat Horatio Seymour), forecasts based on markets (with one notable exception, in 1948) have proved remarkably accurate, as have other forecasting methodologies. To this extent, the accurate forecasting of the 2018 midterms is a return to the norm.

And the next president is …

But what do the results mean for politics in the US more generally? The bottom line is that there was a considerable swing to the Democrats across most of the country, especially among women and in the suburbs, such that the Republican advantage of almost 1% in the House popular vote in 2016 was turned into a Democrat advantage of about 8% this time. If reproduced in a presidential election, it would be enough to provide a handsome victory for the candidate of the Democratic Party.


The size of this swing, and the demographics underpinning it, were identified with a good deal of accuracy by the main forecasting methodologies. This success has clearly restored some confidence in them, and they will now be used to look forward to 2020. Useful current forecasts for the 2020 election include PredictIt, OddsChecker, Betfair and PredictWise.

Taken together, they indicate that the Democratic candidate for the presidency will most likely come from a field including Senators Kamala Harris (the overall favourite), Bernie Sanders, Elizabeth Warren, Amy Klobuchar, Kirsten Gillibrand and Cory Booker. Outside the Senate, the frontrunners are former vice-president, Joe Biden, and the recent (unsuccessful) candidate for the Texas Senate, Beto O’Rourke.

Whoever prevails is most likely to face sitting president, Donald Trump, who is close to even money to face impeachment during his current term of office. If Trump isn’t the Republican nominee, the vice-president, Mike Pence, and former UN ambassador Nikki Haley are attracting the most support in the markets. The Democrats are currently about 57% to 43% favourites over the Republicans to win the presidency.

With the midterms over, our faith in political forecasting, at least in the US, has been somewhat restored. The focus now turns to 2020, and whether the forecasters will accurately predict the next leader of the free world or be left floundering by the unpredictable forces of a new world politics.

Is Schrodinger’s Cat Dead? Mystery in the Quantum World

Is it possible to be both alive and dead at the same time? This is the question central to the famous Schrödinger’s Cat thought experiment. In the version posed by Erwin Schrödinger, a cat is placed in an opaque box for an hour with a small piece of radioactive material which has an equal probability of decaying or not in that time period. If some radioactivity is detected by a Geiger counter also placed in the box, a relay releases a hammer which breaks a flask of hydrocyanic acid, killing the cat. If no radioactivity is detected, the cat lives. Before we open the box at the end of the hour, we estimate the chance that the radioactive material will decay and the cat will be dead at 50/50, the same as that it will be alive. Before we open the box, however, is the cat alive (and we don’t know it yet), dead (and we don’t know it yet) or both alive and dead (until we open the box and find out).

Common sense would seem to indicate that it is either alive or dead, but we don’t know until we open the box. Traditional quantum theory suggests otherwise. The cat is both alive, with a certain probability, and dead, with a certain probability, until we open the box and find out, when it has to become one or the other with a probability of 100 per cent. In quantum terminology, the cat is in a superposition (two states at the same time) of being alive and dead, which only collapses into one state (dead or alive) when the cat is observed. This might seem absurd when applied to a cat. After all surely it was either alive or dead before we opened the box and found out. It was simply that we didn’t know which. That may be true, when applied to cats. But when applied to the microscopic quantum world, such common sense goes out the window as a description of reality. For example, photons (the smallest measure of light) can exist simultaneously in both wave and particle states, and travel in both clockwise and anti-clockwise directions at the same time. Each state exists in the same moment. As soon as the photon is observed, however, it must settle on one unique state. In other words, the common sense that we can apply to cats we cannot apply to photons or other particles at the quantum level.

So what is going on? The traditional explanation as to why the same quantum particle can exist in different states simultaneously is known as the Copenhagen Interpretation. First proposed by Niels Bohr in the early twentieth century, the Copenhagen interpretation states that a quantum particle does not exist in any one state but in all possible states at the same time, with various probabilities. It is only when we observe it that it must in effect choose which of these states it exists as. At the sub-atomic level, then, particles seem to exist in a state of what is called ‘coherent superposition’, in which they can be two things at the same time, and only become one when they are forced to do so by the act of being observed. The total of all possible states is known as the ‘wave function.’ When the quantum particle is observed, the superposition ‘collapses’ and the object is forced into one of the states that make up its wave function.

The problem with this explanation is that all these different states exist. By observing the object, it might be that it reduces down to one of these states, but what has happened to the others? Where have they disappeared to?

This question lies at the heart of the so-called ‘Quantum Suicide’ thought experiment.

It goes like this. A man (not a cat) sits down in front of a gun which is linked to a machine that measures the spin of a quantum particle (a quark). If it is measured as spinning clockwise, the gun will fire and kill the man. If it is measured as spinning anti-clockwise, it will not fire and the man will survive to undergo the same experiment again.

The question is – will the man survive, and how long will he survive for? This thought experiment, proposed by Max Tegmark, has been answered in different ways by quantum theorists depending on whether or not they adhere to the Copenhagen Interpretation. In that interpretation, the gun will go off with a certain probability, depending on which way the quark is spinning. Eventually, by the laws of chance, the man will be killed, probably sooner rather than later. A growing number of theorists believe something else, however. They see both states (the particle is spinning clockwise and spinning anti-clockwise) as equally real, so there are two real outcomes. In one world, the man dies and in the other he lives. The experiment repeats, and the same split occurs. In one world there will exist a man who survives an indefinite number of rounds. In the other worlds, he is dead.

The difference between these alternative approaches is critical. The Copenhagen approach is to propose that the simultaneously existing states (for example, the quark that is spinning both clockwise and anti-clockwise simultaneously) exist in one world, and collapse into one of these states when observed. Meanwhile, the other states mysteriously disappear. The other approach is to posit that these simultaneously existing states are real states, and neither magically disappears, but branch off into different realities when observed. What is happening is that in one world, the particle is observed spinning clockwise (in the Quantum Suicide thought experiment, the man dies) and in the other world the particle is observed spinning the other way (and the man lives). Crucially, according to this interpretation both worlds are real. In other words, they are not notional states of one world but alternative realities. This is the so-called ‘Many Worlds Theory.’

Where is the burden of proof in trying to determine which interpretation of reality is correct? This depends on whether we take the one world that we can observe as the default position or the wave function of all possible states as represented in the mathematics of the wave function as the reality. Adherents to the Many Worlds position argue that the default is to go with what is described in the mathematics underpinning quantum theory – that the wave function represents all of reality. According to this argument, the minimal mathematical structure needed to make sense of quantum mechanics is the existence of many worlds which branch off, each of which contains an alternative reality. Moreover, these worlds are real. To say that our world, the one that we are observing, is the only real one, despite all the other possible worlds or measurement outcomes, has been likened to when we believed that the Earth was at the centre of the universe. There is no real justification, according to this interpretation, for saying that our branch of all possible states is the only real one, and that all other branches are non-existent or are ‘disappeared worlds.’ Put another way, the mathematics of quantum mechanics describes these different worlds. Nothing in the maths says that this world that we observe is more real than another world. So the burden of proof is on those who say it is. The viewpoint of the Copenhagen school is diametrically opposite. They argue that the hard evidence is of the world we are in, and the burden of proof is on those positing other worlds containing other branches of reality.

Which default position we choose to adopt will determine whether we are adherents of the Copenhagen or the ‘Many Worlds’ school.

For me personally, the logic of the argument points to the Many Worlds school. But to believe that they are right, and the Copenhagen school is wrong, seems kind of crazy, and totally counter-intuitive. In another world, of course, I’m probably saying the exact opposite.

Do We Live in a Simulation?

Do we live in a simulation, created by an advanced civilisation, in which we are part of some sophisticated virtual reality experience? For this to be a possibility we need to assume that sufficiently advanced civilisations will possess the requisite computing and programming power to create what philosopher Nick Bostrom termed ‘ancestor simulations’. These simulations would be complex enough for the minds that are simulated to be conscious and able to experience the type of experiences that we do. The creators of these simulations could exist at any stage in the development of the universe, even billions of years into the future.
The argument around simulation goes like this. One of the following three statements must be correct.
a. That civilisations at our level of development always or almost always disappear before becoming technologically advanced enough to create these simulations.
b. That the proportion of these technologically advanced civilisations that wish to create these simulations is zero or almost zero.
c. That we are almost sure to be living in such a simulation.
To see this, let’s examine each proposition in turn.
a. Suppose that the first is not true. In that case, a significant proportion of civilisations at our stage of technology go on to become technologically advanced enough to create these simulations.
b. Suppose that the second is not true. In this case, a significant proportion of these civilisations run such simulations.
c. If both of the above propositions are not true, then there will be countless simulated minds indistinguishable to all intents and purposes from ours, as there is potentially no limit to the number of simulations these civilisations could create. The number of such simulated minds would almost certainly be overwhelmingly greater than the number of minds that created them. Consequently, we would be quite safe in assuming that we are almost certainly inside a simulation created by some form of advanced civilisation.
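The counting step in the argument can be made concrete with a small sketch. This is only an illustration under stated assumptions: the function name and its three inputs are hypothetical, and the model assumes each simulation contains roughly as many minds as one unsimulated history.

```python
def fraction_simulated(f_mature: float, f_interested: float, sims_per_civ: float) -> float:
    """Share of all minds that are simulated, under a toy counting model.

    f_mature     -- assumed fraction of civilisations that reach technological maturity
    f_interested -- assumed fraction of mature civilisations that run simulations
    sims_per_civ -- assumed average number of simulations run by each of those
    """
    # Expected number of simulated histories for every unsimulated one.
    sims_per_real = f_mature * f_interested * sims_per_civ
    return sims_per_real / (sims_per_real + 1.0)


if __name__ == "__main__":
    # Hypothetical inputs: even small fractions give a share close to one
    # once the number of simulations per civilisation is large.
    for f in (0.0, 0.001, 0.1):
        print(f, fraction_simulated(f, f, 1_000_000))
```

Unless one of the first two fractions is zero or very close to it, the share of simulated minds is pushed towards one, which is the shape of the trilemma.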

For the first proposition to be untrue, civilisations must be able to go through the phase of being able to wipe themselves out, either deliberately or by accident, carelessness or neglect, and never or almost never do so. This might perhaps seem unlikely based on our experience of this world, but becomes more likely if we consider all other possible worlds.
For the second proposition to be untrue, we would have to assume that virtually all civilisations that were able to create these simulations would decide not to do so. This again is possible, but would seem unlikely.
If we consider both propositions, and we think it is unlikely that no civilisations survive long enough to achieve what Bostrom calls ‘technological maturity’, and unlikely that hardly any would create ‘ancestor simulations’ if they could, then anyone considering the question is left with a stark conclusion: they are almost certainly living in a simulation.
To summarise. An advanced ‘technologically mature’ civilisation would have the capability of creating simulated minds. Based on this, at least one of three propositions must be true.
a. The proportion of civilisations at our level of development that go on to become ‘technologically mature’ is zero or close to zero.
b. The proportion of these advanced civilisations that wish to run these simulations is close to zero.
c. The proportion of those consciously considering the question who are living in a simulation is close to one.
If the first of these propositions is true, we will almost certainly not survive to become ‘technologically mature.’ If the second proposition is true, virtually no advanced civilisations are interested in using their power to create such simulations. If the third proposition is true, then conscious beings considering the question are almost certainly living in a simulation.
Through the veil of our ignorance, it might seem sensible to assign equal credence to all three, and to conclude that unless we are currently living in a simulation, descendants of this civilisation will almost certainly never be in a position to run these simulations.
Strangely indeed, the probability that we are living in a simulation increases as we draw closer to the point at which we are able and willing to create such simulations ourselves. At the point that we would be ready to create our own simulations, we would paradoxically be at the very point when we were almost sure that we ourselves were simulations. Only by refraining from doing so could we in a certain sense make it less likely that we were simulated, as it would show that at least one civilisation that was able to create simulations refrained from doing so. Once we took the plunge, we would know that we were almost certainly only doing so as simulated beings. And yet there must have been someone or something that created the first simulation. Could that be us, we would be asking ourselves? In our simulated hearts and minds, we would already know the answer!

Cracking the St. Petersburg Paradox

The puzzle was first posed by the Swiss mathematician Nicolas Bernoulli in a letter to Pierre Raymond de Montmort on 9 September 1713, and published in the Commentaries of the Imperial Academy of Science of St. Petersburg. Mercifully it is simple to state. Less mercifully, it is supposedly a nightmare to solve. To state the paradox, imagine tossing a coin until it lands heads-up, and suppose that the payoff grows exponentially according to the number of tosses you make. If the coin lands heads-up on the first toss, the payoff is £2 and the game ends. If it lands tails, you toss again. If it first lands heads-up on the second toss, the payoff is £4; if it takes three tosses, the payoff is £8; and so forth, ad infinitum. You can play as many rounds of the game as you wish.

Now the probability of the game ending on the first toss is ½; of it ending on the second toss, (1/2)^2 = ¼; on the third, (1/2)^3 = 1/8, etc., so your expected win from playing the game = (1/2 x £2) + (1/4 x £4) + (1/8 x £8) + …, i.e. £1 + £1 + £1 + … = infinity. It follows that you should be willing to pay any finite amount for the privilege of playing this game. Yet it seems irrational to pay very much at all.
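The divergence of the expected value can be checked by cutting the sum off after a fixed number of tosses. The sketch below assumes the payoff is £2^k when the game ends on toss k, which happens with probability (1/2)^k; the function name is hypothetical.

```python
from fractions import Fraction


def truncated_expected_value(max_tosses: int) -> Fraction:
    """Expected win in pounds if only the first max_tosses outcomes are counted."""
    total = Fraction(0)
    for k in range(1, max_tosses + 1):
        probability = Fraction(1, 2 ** k)  # game ends on toss k
        payoff = 2 ** k                    # assumed payoff in pounds
        total += probability * payoff      # each term contributes exactly 1
    return total


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, truncated_expected_value(n))
```

Each extra toss allowed adds exactly £1 to the truncated sum, so the total grows without bound as the cut-off is removed.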

According to this reasoning, any finite stake is justified because the eventual payout increases infinitely through time, so you must end up with a profit whenever the game ends. Yet most people are only willing to pay a few pounds, or at least not much more than this. So is this yet further evidence of our intuition letting us down?

That depends on why most people are not willing to pay much. There have been very many explanations proposed over the years, some more satisfying than others, but none has been universally accepted as getting near to a convincing explanation.

The best attempt, and the one which I find the most convincing, is to address the issue of infinity. It is true, of course, that if you play an infinite number of rounds of the game you will win an infinite amount. But what happens in the real, finite world? Here is the problem. The fact that playing to infinity pays an infinite amount does not mean that the game pays out enough in finite time to cover any stake. The key question in finite time is WHEN does the game turn profitable? The answer depends on the size of the stake per round. If this stake is £2, and you repeat the game over and over again, you are likely to make a lot of money very quickly. As the stake size increases, the number of rounds it takes to turn a profit becomes increasingly large. Take the example of a stake of £4. In this case, you only make a profit in a round if you throw three heads in a row, which is a 1 in 8 chance. You now need to factor in the losses you made in rounds where you didn’t throw three heads in a row. This extends the number of rounds it will take to turn a profit. So the game is not profitable at every stake size unless we are willing and able to play an infinite number of rounds. It is, in theoretical terms, profitable at any stake size, however large, but it will take forever to guarantee a profit. In a world of finite rounds and time scales, however, the winnings generated by the game are easily outweighed by a sufficiently large stake.

So what is the optimal stake size for playing the St. Petersburg game?

This depends on how many rounds you are willing to play and how likely you wish to be to come out ahead in that timescale.

This has been modelled empirically, using a computer program to calculate the outcome at different staking levels. What does it show? Well, if you stake a pound a round, you have a better than even chance of being in profit after just three rounds. If you pay £2 a round, the even-money chance of coming out ahead takes rather more rounds – about seven. At £3 a round we are looking at more than 20 rounds, at £4 approaching 100 rounds and £5 more than 300 rounds. By the time we are staking £10 a go, more than 350,000 rounds are needed to give you more than an even chance of being ahead of the game. An approximation that generates the 50-50 point to any staking level is 4 to the power of the stake, divided by 2.9. So what’s a reasonable spend per round to play the game? That depends on the person and the exact configuration of the game. Either way, it’s not that high.
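A rough Monte Carlo sketch of this kind of modelling is given here. It is not the program referred to in the text: the function names, the number of trials and the seed are all assumptions, and the payout schedule is assumed to be £1, £2, £4, … (set first_payoff=2 for a £2, £4, £8, … schedule). The last function simply evaluates the approximation quoted in the paragraph.

```python
import random


def play_round(rng: random.Random, first_payoff: int = 1) -> int:
    """Play one round: the payoff doubles each time the game continues."""
    payoff = first_payoff
    while rng.random() < 0.5:  # the game continues with probability 1/2
        payoff *= 2
    return payoff


def chance_of_profit(stake: float, rounds: int, trials: int = 2000,
                     seed: int = 0, first_payoff: int = 1) -> float:
    """Estimate the probability of being ahead after a given number of rounds."""
    rng = random.Random(seed)
    ahead = 0
    for _ in range(trials):
        winnings = sum(play_round(rng, first_payoff) for _ in range(rounds))
        if winnings > stake * rounds:
            ahead += 1
    return ahead / trials


def approx_break_even_rounds(stake: float) -> float:
    """The approximation quoted in the text: 4 to the power of the stake, over 2.9."""
    return 4 ** stake / 2.9


if __name__ == "__main__":
    for stake in (1, 2, 3, 4, 5):
        n = max(1, round(approx_break_even_rounds(stake)))
        print(stake, n, chance_of_profit(stake, n))
```

Being a simulation, the estimates move with the seed and the number of trials, so they should be read as indicative rather than exact.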

Perhaps the median (the mean of the two middle values of the series), rather than the mean offers a pretty good approximation to the way most people think about this.

Let’s say that the game is run 1,000 times, in a version that pays £1 if the game ends on the first toss, £2 if it ends on the second, and so on. In this case, about 500 of the values come from games that end on the first toss, with a return of £1. The next 25% of values come from games that end on the second toss, with a return of £2. The rest of the values are not then relevant. Sorting the returns from smallest to largest, the 500th value is £1 and the 501st value is £2. The median is the mean of £1 and £2, i.e. £1.50.
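The median calculation can be reproduced with an idealised set of outcomes. As an assumption for convenience, the sketch uses 1,024 games rather than 1,000, so that exactly half of the remaining games end at each toss, and it assumes returns of £1, £2, £4, … as in the example. The function name is hypothetical.

```python
from statistics import median


def idealised_returns(n_games: int = 1024) -> list:
    """Returns from n_games rounds if exactly half the remaining games end at each toss."""
    returns = []
    remaining = n_games
    payoff = 1
    while remaining > 1:
        ending_now = remaining // 2            # half of the games still running end here
        returns.extend([payoff] * ending_now)
        remaining -= ending_now
        payoff *= 2
    returns.extend([payoff] * remaining)       # the single longest-running game
    return returns


if __name__ == "__main__":
    values = idealised_returns()
    print("median:", median(values))
    print("mean:", sum(values) / len(values))
```

The median stays at £1.50 however many games are included, while the mean of the sample keeps creeping upward as the sample grows.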

Whichever of the two ways proposed here we look at it, the solution is much closer to most people’s intuitive answer than it is to the answer implied by the classic formulation of the St. Petersburg problem.

 

Reading

Koelman, J. Statistical Physics Attacks St. Petersburg: Paradox Resolved.

http://www.science20.com/hammock_physicist/statistical_physics_attacks_st_petersburg_paradox_resolved-96549

Fine, T.A. The Saint Petersburg Paradox is a Lie.

https://medium.com/@thomasafine/the-saint-petersburg-paradox-is-a-lie-62ed49aeca0b

Hayden, B.Y. and Platt, M.L. (2009), The mean, the median, and the St. Petersburg Paradox. Judgment and Decision Making, 4 (4), June, 256-272.

http://journal.sjdm.org/9226/jdm9226.html

Is the simpler explanation usually the better one?

William of Occam (also spelled William of Ockham) was a 14th century English philosopher. At the heart of Occam’s philosophy is the principle of simplicity, and Occam’s Razor has come to embody the method of eliminating unnecessary hypotheses. Essentially, Occam’s Razor holds that the theory which explains all (or the most) while assuming the least is the most likely to be correct. This is the principle of parsimony – explain more, assume less. Put more elegantly, it is the principle of ‘pluralitas non est ponenda sine necessitate’ (plurality must never be posited without necessity).

Empirical support for the Razor can be drawn from the phenomenon of ‘overfitting.’ In statistics, ‘overfitting’ occurs when a statistical model describes random error or noise instead of the underlying relationship. Overfitting generally occurs when a model is excessively complex, such as having too many parameters relative to the number of observations. Critically, a model that has been overfit will generally have poor predictive performance, as it can exaggerate minor fluctuations in the data. For example, a complex polynomial function might after the fact be used to pass through each data point, including those generated by noise, but a linear function might be a better fit to the signal in the data. By this we mean that the linear function would predict new and unseen data points better than the polynomial function, although the polynomial, which has been devised to capture both signal and noise, would describe or fit the existing data better.
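A small experiment illustrates the point. Everything in this sketch is an assumption made for illustration: the underlying signal (y = 2x + 1), the noise level, the seed and the function names. It fits a straight line by least squares and a polynomial that passes through every training point, then compares their errors on the training data and on fresh data.

```python
import random


def signal(x: float) -> float:
    """The assumed underlying relationship."""
    return 2.0 * x + 1.0


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def interpolate(xs, ys, x):
    """Lagrange polynomial passing through every (xs, ys) point, evaluated at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def mean_squared_error(predicted, actual):
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


rng = random.Random(0)
train_x = [float(i) for i in range(8)]
train_y = [signal(x) + rng.gauss(0.0, 1.0) for x in train_x]
test_x = [x + 0.5 for x in train_x]            # unseen points between and beyond the data
test_y = [signal(x) + rng.gauss(0.0, 1.0) for x in test_x]

slope, intercept = fit_line(train_x, train_y)
line_train = mean_squared_error([slope * x + intercept for x in train_x], train_y)
line_test = mean_squared_error([slope * x + intercept for x in test_x], test_y)
poly_train = mean_squared_error([interpolate(train_x, train_y, x) for x in train_x], train_y)
poly_test = mean_squared_error([interpolate(train_x, train_y, x) for x in test_x], test_y)

print("line: train", line_train, "test", line_test)
print("poly: train", poly_train, "test", poly_test)
```

The polynomial reproduces the training data exactly, noise included, so its training error is zero by construction. On the unseen points it will typically do worse than the line, although the size of the gap depends on the noise drawn.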

We can also look at it through the lens of what is known as Solomonoff Induction. Whether a detective trying to solve a crime, a physicist trying to discover a new universal law, or an entrepreneur seeking to interpret some latest sales figures, all are involved in collecting information and trying to infer the underlying causes. The problem of induction is this: We have a set of observations (or data), and want to find the underlying causes of those observations, i.e. to find hypotheses that explain our data. We’d like to know which hypothesis is correct, so we can use that knowledge to predict future events. In doing so, we need to create a set of defined steps to arrive at the truth, a so-called algorithm for truth.

Ray Solomonoff’s algorithmic approach takes in data (observations) and outputs the rule by which the data was created. That is, it will give us an explanation of the observations; the causes. Suppose there are many hypotheses that could explain the data. All of the hypotheses are possible but some are more likely than others. How do you weight the various hypotheses? This depends on prior knowledge. But what if you have no prior knowledge of which hypothesis is likely to be better than another? This is where Occam’s Razor comes in. Solomonoff’s theory is one of prediction based on logical observations, such as predicting the next symbol based upon a given series of symbols. The only assumption that the theory makes is that the environment follows some unknown but computable probability distribution. It is as such a mathematical formalisation of Occam’s Razor.

All computable theories which perfectly describe previous observations are used to calculate the probability of the next observation, with more weight put on the shorter computable theories. Shorter computable theories have more weight when calculating the expected reward to an action across all computable theories which perfectly describe previous observations. At any time, given the limited observation sequence so far, what is the optimal way of selecting the next action? The answer is to use Solomonoff’s method of calculating the prior probabilities in order to predict the probability of each possible future, and to execute the policy which, on a weighted average of all possible futures, maximises the predicted reward up to the horizon. This requires a way, however, of measuring the complexity of a theory.

Here we can turn to methods discovered to digitalise communication into a series of 0s and 1s. These series of bits of binary information can be termed strings. A given string, say y, in a given language has a length which will differ depending on the complexity of what is being communicated. This is where the idea of so-called Kolmogorov complexity, K(y), comes in. K(y) is the length of the shortest possible description of string y in a given language. Upper bounds on the Kolmogorov complexity can be simple to find. Consider, for example, the two 32-character sequences:

abababababababababababababababab

4c1j5b2p0cv4w1x8rx2y39umgw5q85s7

The first can be written “ab 16 times”. The second probably cannot be simplified further.
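Kolmogorov complexity itself cannot be computed in general, but any compressor gives an upper bound on description length. The sketch below uses zlib as a stand-in, which is an assumption made purely for illustration; the function name is hypothetical.

```python
import zlib


def compressed_length(text: str) -> int:
    """Length in bytes of the zlib-compressed text: a crude upper bound on its description length."""
    return len(zlib.compress(text.encode("ascii"), 9))


regular = "ab" * 16
irregular = "4c1j5b2p0cv4w1x8rx2y39umgw5q85s7"

if __name__ == "__main__":
    print("regular:  ", len(regular), "characters ->", compressed_length(regular), "bytes")
    print("irregular:", len(irregular), "characters ->", compressed_length(irregular), "bytes")
```

Both strings are 32 characters long, but the repetitive one compresses to fewer bytes. On strings this short the compressor's fixed overhead is a large part of the result, so only the comparison between the two is meaningful.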

Now consider the following inductive problem. A computer program outputs the following sequence of numbers: 1, 3, 5, 7.

What rule gives rise to the number sequence 1,3,5,7? If we know this, it will help us to predict what the next number in the sequence is likely to be, if there is one. Two hypotheses spring instantly to mind. It could be: 2n-1, where n is the step in the sequence. So the third step, for example, gives 2×3-1 = 5. If this is the correct rule generating the observations, the next step in the sequence will be 9 (5×2-1).

But it’s possible that the rule generating the number sequence is: 2n-1 + (n-1)(n-2)(n-3)(n-4). So the third step, for example, gives 2×3-1 + (3-1)(3-2)(3-3)(3-4) = 5, because the product term is zero for each of the first four steps. In this case, however, the next step in the sequence will be 33.
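The two candidate rules can be checked directly. In this short sketch the function names are hypothetical; the formulas are the ones given in the text.

```python
def simple_rule(n: int) -> int:
    """First hypothesis: 2n - 1."""
    return 2 * n - 1


def complex_rule(n: int) -> int:
    """Second hypothesis: 2n - 1 + (n-1)(n-2)(n-3)(n-4)."""
    return 2 * n - 1 + (n - 1) * (n - 2) * (n - 3) * (n - 4)


observed = [1, 3, 5, 7]

if __name__ == "__main__":
    for n in range(1, 6):
        print(n, simple_rule(n), complex_rule(n))
```

Both rules reproduce the four observations exactly and part company only at the fifth step, so the data alone cannot separate them.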

But doesn’t the first hypothesis seem more likely? Occam’s Razor is the principle behind this intuition. “Among all hypotheses consistent with the observations, the simplest is the most likely.” This sounds right, but can it be made more precise, and can it be justified? How do we find all consistent hypotheses, and how do we judge their simplicity?

Probability theory is the mathematics of reasoning with uncertainty. The keystone of this subject is Bayes’ Theorem. This tells you how likely something is given some other knowledge. Bayes’ Theorem can tell us how likely a hypothesis is, given evidence (or data, or observations). This is helpful because we want to know which model of the world is correct so that we can successfully predict the future.

It calculates this probability based on the prior probability of the hypothesis alone, the probability of the evidence alone, and the probability of the evidence given the hypothesis. It is just a matter of plugging the numbers in, although it is not always easy to identify these. But you can do your best. With enough evidence, it should become clear which hypothesis is correct. But guesses are not well-suited to an exact algorithm, so how can we construct this algorithm? Most situations in real life are complex, so that your “priors” (as used in Bayes’ Theorem) are actually probabilities that have been updated several times with past evidence.
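The calculation described here can be sketched in a few lines. The function name and the numbers are hypothetical; the probability of the evidence is computed from the two ways it could arise (hypothesis true or hypothesis false).

```python
def posterior(prior: float, likelihood: float, likelihood_if_false: float) -> float:
    """Bayes' Theorem: probability of the hypothesis given the evidence.

    prior               -- probability of the hypothesis before seeing the evidence
    likelihood          -- probability of the evidence if the hypothesis is true
    likelihood_if_false -- probability of the evidence if the hypothesis is false
    """
    evidence = likelihood * prior + likelihood_if_false * (1.0 - prior)
    return likelihood * prior / evidence


if __name__ == "__main__":
    belief = 0.5  # hypothetical starting prior
    for _ in range(3):
        # Each piece of evidence is assumed to be four times as likely if the hypothesis is true.
        belief = posterior(belief, 0.8, 0.2)
        print(belief)
```

Feeding each posterior back in as the next prior is what it means for a prior to have been updated several times with past evidence.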

But what would our ideal reasoning computer do before it knew anything? What would the probabilities be set to before we turned it on? How can we determine the probability of a hypothesis before seeing any data?

The answer is Occam’s Razor: simpler hypotheses are more likely. But how rigorous is this? It’s usually difficult to find a measure of complexity, even for mathematical hypotheses. Is a normal curve simpler than an exponential curve, for example? Bayesian probability theory doesn’t have anything to say about choosing priors. Thus, the same probability is often assigned to each of the various hypotheses that might explain the observations. Of course this is a good approach if all the hypotheses actually are equally likely. But some hypotheses are more complex than others, and this makes them less likely than the other hypotheses. So when distributing your probability across several hypotheses, you shouldn’t necessarily distribute it evenly.

But we need a method that all can agree provides the correct priors in all situations. This helps us perform induction correctly and instils more honesty into the process. Since priors partly determine what people believe, they can sometimes choose priors that help “prove” what they want to prove, intentionally or unintentionally. To solve the problem of priors once and for all, we’d like to have an acceptable, universal prior distribution, so that there’s no vagueness in the process of induction. We need a recipe, an algorithm, for selecting our priors. For that we turn to the subject of binary sequences.

Return to the sequence 1, 3, 5, 7. If this is all the information we have, we have two different hypotheses about the rule generating the data. How do we decide which is more likely to be true? In general, when we have more than one hypothesis, each of which could be true, how can we decide which one actually is true? To start, is there a language in which we can express all problems, all data, all hypotheses? Let’s look at binary data. This is the name for representing information using only the characters ‘0’ and ‘1’. In a sense, binary is the simplest possible alphabet. With these two characters we can encode information. Each 0 or 1 in a binary sequence (e.g. 01001011) can be considered the answer to a yes-or-no question. And in principle, all information can be represented in binary sequences. Indeed, being able to do everything in the language of binary sequences simplifies things greatly, and gives us great power. We can treat everything contained in the data in the same way.

Now that we have a simple way to deal with all types of data, we need to look at the hypotheses, in particular how to assign prior probabilities to the hypotheses. When we encounter new data, we can then use Bayes’ Theorem to update these probabilities. To be complete, to guarantee we find the real explanation for our data, we have to consider all possible hypotheses. But how could we ever find all possible explanations for our data? By using the language of binary, we can do so.

Here we look to the concept of Solomonoff induction, in which the assumption we make about our data is that it was generated by some algorithm, i.e. the hypothesis that explains the data is an algorithm. Now we can find all the hypotheses that would predict the data we have observed. Given our data, we find potential hypotheses to explain it by running every hypothesis, one at a time. If the output matches our data, we keep it. Otherwise, we discard it. We now have a methodology, at least in theory, to examine the whole list of hypotheses that might be the true cause behind our observations.

Since they are algorithms, these hypotheses look like binary sequences. For example, the first few might be 01001101, 0011010110000110100100110, and 100011111011111110001110100101000001.

That is, when you run each of these three as a program, the output is our data. But which of the three is more likely to be the true hypothesis that generated the data in the first place? The first thing is to imagine that the true algorithm is produced in an unbiased way, by tossing a coin. For each bit of the hypothesis, we toss a coin. Heads will be 0, and tails will be 1. In the previous example, 01001101, the coin landed heads, tails, heads, heads and so on. Because each toss of the coin has a 50% probability, each bit contributes a factor of ½ to the final probability. Therefore, an algorithm that is one bit longer is half as likely to be the true algorithm. This intuitively fits with Occam’s Razor: a hypothesis that is 8 bits long is much more likely than a hypothesis that is 34 bits long. Why bother with extra bits? We’d need evidence to show that they were necessary. So why not take the shortest hypothesis and call that the truth? Because all of the hypotheses predict the data we have so far, and in the future we might get data to rule out the shortest one. The more data we get, the easier it is likely to become to pare down the number of competing hypotheses which fit the data.

With the data we have, we keep all consistent hypotheses, but weight the shorter ones with higher probability. So in our eight-bit example, the probability of 01001101 being the true algorithm is 1/256, although this isn’t a probability in the normal sense, since the probabilities have not been normalised to sum to one. But these probabilities can still be used to compare how likely different hypotheses are. Solomonoff induction is the process that describes the scientific method made into an algorithm.
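The weighting can be worked through for the three example hypotheses. In this sketch the function name is hypothetical, and normalising over just these three strings is a simplification made for illustration (the full method sums over every consistent program).

```python
from fractions import Fraction


def prior_weight(program: str) -> Fraction:
    """Each bit halves the weight, so a program of n bits gets weight 1 / 2**n."""
    return Fraction(1, 2 ** len(program))


hypotheses = [
    "01001101",
    "0011010110000110100100110",
    "100011111011111110001110100101000001",
]

weights = {h: prior_weight(h) for h in hypotheses}
total = sum(weights.values())
normalised = {h: w / total for h, w in weights.items()}

if __name__ == "__main__":
    for h in hypotheses:
        print(len(h), "bits", weights[h], float(normalised[h]))
```

After normalising, almost all of the probability sits on the eight-bit hypothesis, because each additional bit halves the weight.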

To summarize, Solomonoff induction works by starting with all possible hypotheses (sequences) as represented by computer programs (that generate those sequences), weighted by their simplicity (2^-n, where n is the program length in bits), and discarding those hypotheses that are inconsistent with the data. Weighting hypotheses by simplicity, the system automatically incorporates a form of Occam’s Razor.
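A toy version of the procedure can be written down, with one loud caveat: true Solomonoff induction ranges over all programs for a universal computer and is not computable. The sketch below swaps in a deliberately tiny, made-up "language" in which a program is a bit string that simply outputs itself repeated forever. The function names and the length limit are assumptions.

```python
from fractions import Fraction
from itertools import product


def run(program: str, n_bits: int) -> str:
    """Toy interpreter (an assumption): a program outputs itself repeated forever."""
    repeats = n_bits // len(program) + 1
    return (program * repeats)[:n_bits]


def probability_next_is_one(data: str, max_len: int = 8) -> Fraction:
    """Weight every consistent program by 2**-length and predict the next bit."""
    weight = {"0": Fraction(0), "1": Fraction(0)}
    for length in range(1, max_len + 1):
        for bits in product("01", repeat=length):
            program = "".join(bits)
            output = run(program, len(data) + 1)
            if output.startswith(data):          # keep only hypotheses that fit the data
                weight[output[-1]] += Fraction(1, 2 ** length)
    return weight["1"] / (weight["0"] + weight["1"])


if __name__ == "__main__":
    print(probability_next_is_one("010101"))
```

For the data 010101, the shortest consistent program is "01", which predicts a 0 next, and it dominates the longer programs that predict a 1. The prediction therefore leans heavily towards 0 without any hypothesis being thrown away merely for being long.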

Turning now to ‘ad hoc’ hypotheses and the Razor. In science and philosophy, an ‘ad hoc hypothesis’ is a hypothesis added to a theory in order to save it from being falsified. Ad hoc hypothesising is compensating for anomalies not anticipated by the theory in its unmodified form. For example, you say that there is a leprechaun in your garden shed. A visitor to the shed sees no leprechaun. This is because he is invisible, you say. He spreads flour on the ground to see the footprints. He floats, you declare. He wants you to ask him to speak. He has no voice, you say. More generally, for each accepted explanation of a phenomenon, there is generally an infinite number of possible, more complex alternatives. Each true explanation may therefore have had many alternatives that were simpler and false, but also approaching an infinite number of alternatives that are more complex and false.

This leads us to the idea of what I term ‘Occam’s Leprechaun.’ Any new and more complex theory can always be possibly true. For example, if an individual claims that leprechauns were responsible for breaking a vase that he is suspected of breaking, the simpler explanation is that he is not telling the truth, but ongoing ad hoc explanations (e.g. “That’s not me on the CCTV, it’s a leprechaun disguised as me”) prevent outright falsification. An endless supply of elaborate competing explanations, called ‘saving hypotheses’, prevents ultimate falsification of the leprechaun hypothesis, but appeal to Occam’s Razor helps steer us toward the probable truth. Another way of looking at this is that simpler theories are more easily falsifiable, and hence possess more empirical content.

All assumptions introduce possibilities for error; if an assumption does not improve the accuracy of a theory, its only effect is to increase the probability that the overall theory is wrong. It can also be looked at this way. The prior probability that a theory based on n+1 assumptions is true must be less than that of a theory based on n assumptions, unless the additional assumption is a consequence of the previous assumptions. For example, the prior probability that Jack is a train driver AND that he owns a Mini Cooper must be less than the prior probability that Jack is a train driver, unless all train drivers own Mini Coopers, in which case the prior probabilities are identical.

Again, the prior probability that Jack is a train driver and a Mini Cooper owner and a ballet dancer is less than the prior probability that he is just the first two, unless all train drivers are not only Mini Cooper owners but also ballet dancers. In the latter case, the prior probabilities of the n and n+1 assumptions are the same.
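The arithmetic behind the Jack example is the product rule for joint probabilities. The numbers below are purely hypothetical and the function name is an assumption.

```python
def joint_probability(p_a: float, p_b_given_a: float) -> float:
    """P(A and B) = P(A) * P(B given A), which can never exceed P(A)."""
    return p_a * p_b_given_a


# Hypothetical figures for illustration only.
p_driver = 0.01                                      # Jack is a train driver
p_driver_mini = joint_probability(p_driver, 0.2)     # ... and owns a Mini Cooper
p_driver_mini_ballet = joint_probability(p_driver_mini, 0.05)  # ... and is a ballet dancer

if __name__ == "__main__":
    print(p_driver, p_driver_mini, p_driver_mini_ballet)
    # If every train driver owned a Mini Cooper, the extra assumption would cost nothing.
    print(joint_probability(p_driver, 1.0))
```

Each added assumption multiplies the prior by a factor of at most one, and the factor equals one only when the new assumption follows from the earlier ones.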

From Bayes’ Theorem, we know that reducing the prior probability will reduce the posterior probability, i.e. the probability that a proposition is true after new evidence arises. Science prefers the simplest explanation that is consistent with the data available at a given time, but even so the simplest explanation may be ruled out as new data become available. This does not invalidate the Razor, which does not state that simpler theories are necessarily more true than more complex theories, but that when more than one theory explains the same data, the simpler should be accorded more probabilistic weight. The theory which explains all (or the most) and assumes the least is most likely. So Occam’s Razor advises us to keep explanations simple. But it is also consistent with multiplying entities necessary to explain a phenomenon. A simpler explanation which fails to explain as much as another more complex explanation is not necessarily the better one. So if leprechauns don’t explain anything they cannot be used as proxies for something else which can explain something.

More generally, we can now unify Epicurus and Occam. From Epicurus’ Principle we need to keep open all hypotheses consistent with the known evidence which are true with a probability of more than zero. From Occam’s Razor we prefer from among all hypotheses that are consistent with the known evidence, the simplest. In terms of a prior distribution over hypotheses, this is the same as giving simpler hypotheses higher ‘a priori’ probability, and more complex ones lower probability.

From here we can move to the wider problem of induction about the unknown by extrapolating a pattern from the known. Specifically, the problem of induction is how we can justify inductive inference. According to Hume’s ‘Enquiry Concerning Human Understanding’ (1748), if we justify induction on the basis that it has worked in the past, then we have to use induction to justify why it will continue to work in the future. This is circular reasoning. This is faulty theory. “Induction is just a mental habit, and necessity is something in the mind and not in the events.” Yet in practice we cannot help but rely on induction. We are working from the idea that it works in practice if not in theory – so far. Induction is thus related to an assumption about the uniformity of nature. Of course, induction can be turned into deduction by adding principles about the world (such as ‘the future resembles the past’, or ‘space-time is homogeneous.’) We can also assign to inductive generalisations probabilities that increase as the generalisations are supported by more and more independent events. This is the Bayesian approach, and it is a response to the perspective pioneered by Karl Popper. From the Popperian perspective, a single observational event may prove hypotheses wrong, but no finite sequence of events can verify them correct. Induction is from this perspective theoretically unjustifiable and becomes in practice the choice of the simplest generalisation that resists falsification. The simpler a hypothesis, the easier it is to be falsified. Induction and falsifiability are in practice, from this viewpoint, as good as it gets in science. Take an inductive inference problem where there is some observed data and a set of hypotheses, one of which may be the true hypothesis generating the data. The task then is to decide which hypothesis, or hypotheses, are the most likely to be responsible for the observations.

A better way of looking at this seems to be to abandon certainties and think probabilistically. Entropy is the tendency of isolated systems to move toward disorder, and a quantification of that disorder. For example, assembling a deck of cards in a defined order requires introducing some energy to the system. If you drop the deck, the cards become disorganised and won’t re-organise themselves automatically. This is the tendency in all systems to disorder. This is the Second Law of Thermodynamics, which implies that time is asymmetrical with respect to the amount of order: as the system advances through time, it will statistically become more disordered. By ‘Order’ and ‘Disorder’ we mean how compressed the information is that is describing the system. So if all your papers are in one neat pile, then the description is “All paper in one neat pile.” If you drop them, the description becomes “One paper to the right, another to the left, one above, one below, etc. etc.” The longer the description, the higher the entropy. According to Occam’s Razor, we want a theory with low entropy, i.e. low disorder, high simplicity. The lower the entropy, the more likely it is that the theory is the true explanation of the data, and hence that theory should be assigned a higher probability.

More generally, whatever theory we develop, say to explain the origin of the universe, or consciousness, or non-material morality, must itself be based on some theory, which is based on some other theory, and so on. At some point we need to rely on some statement which is true but not provable, and so we think may be false, although it is actually true. We can never solve the ultimate problem of induction, but Occam’s Razor combined with Epicurus, Bayes and Popper is as good as it gets if we accept that. So Epicurus, Occam, Bayes and Popper help us pose the right questions, and help us to establish a good framework for thinking about the answers.

At least that applies to the realm of established scientific enquiry and the pursuit of scientific truth. How far it can properly be extended beyond that is a subject of intense and continuing debate.

Forecasting Elections and Other Things – Where did it all go wrong?

There are a number of ways that have been used over the years to forecast the outcome of elections. These include betting markets, opinion polls, expert analysis, crystal balls, tea leaves, Tarot cards and astrology! Let’s start by looking at the historical performance of betting markets in forecasting elections.

The recorded history of election betting markets can be traced as far back as 1868 for US presidential elections and 1503 for papal conclaves. In both years, the betting favourite won (Ulysses S. Grant was elected President in 1868; Cardinal Francesco Piccolomini was elected Pope Pius III in 1503). From 1868 up to 2016, no clear favourite for the White House had lost the presidential election other than in 1948, when longshot Harry Truman defeated his Republican rival, Thomas Dewey. The record of the betting markets in predicting the outcome of papal conclaves since 1503 is less complete, however, and a little more chequered. The potential of the betting markets and prediction markets (markets created to provide forecasts) to assimilate collective knowledge and wisdom has increased in recent years as the volume of money wagered and the number of market participants have soared. Betting exchanges (where people offer and take bets directly, person-to-person) now see tens of millions of pounds trading on a single election. An argument made for the value of betting markets in predicting the probable outcome of elections is that the collective wisdom of many people is greater than that of the few. We might also expect that those who know more, and are better able to process the available information, would on average tend to bet more. Moreover, the lower the transaction costs of betting and the lower the cost of accessing and processing information, the more efficient we might expect betting markets to become in translating information into forecasts. In fact, the betting public have not paid tax on their bets in the UK since 2001, and margins have fallen significantly since the advent of person-to-person betting exchanges which cut out the middleman bookmaker. Information costs have also plummeted as we have witnessed the development of the Internet and search engines. Modern betting markets might be expected for these reasons to provide better forecasts than ever.

There is indeed plenty of solid anecdotal evidence about the accuracy of betting markets, especially compared to the opinion polls. The 1985 by-election for the vacant parliamentary seat of Brecon and Radnor in Wales offers a classic example. Mori, the polling organisation, had the Labour candidate on the eve of poll leading by a massive 18%, while Ladbrokes, the bookmaker, simultaneously quoted the Liberal Alliance candidate as odds-on 4/7 favourite. When the result was declared, there were two winners – the Liberal candidate and the bookmaker.

In the 2000 US presidential election, IG Index, the spread betting company, offered a spread on the day of 265 to 275 electoral college votes about both Bush and Gore. Meanwhile, Rasmussen, the polling company, had Bush leading Gore by 9% in the popular vote. In the event, the electoral college (courtesy of a controversial US Supreme Court judgment) split 271 to 266 in favour of Bush, both within the quoted spreads. Gore also won the popular vote, putting the pollster out by almost 10 percentage points.

In the 2004 US presidential election, the polls were mixed. Fox had Kerry up by 2 per cent, for example, while GW/Battleground had Bush up 4. There was no consensus nationally, much less state by state. Meanwhile, the favourite on the Intrade prediction market for each state won every single one of those states.

In 2005, I was asked on to a BBC World Service live radio debate in the immediate run-up to the UK general election, where I swapped forecasts with Sir Robert Worcester, Head of the Mori polling organisation. I predicted a Labour majority of about 60, as I had done a few days earlier in the Economist magazine and on BBC Radio 4 Today, based on the betting at the time. Mori had Labour on a projected majority of over 100 based on their polling. The majority was 66.

In the 2008 US presidential election, the Betfair exchange market’s state-by-state predictions called 49 out of 50 states correctly. Only Indiana was called wrong. While the betting markets always had Obama as firm favourite, the polls had shown different candidates winning at different times in the run-up to the election. On polling day, Obama was as short as 1 to 20 to win on the betting exchanges, but some polls still had it well within the margin of error. He won by 7.2%, and by 365 electoral votes to 173.

In the 2012 US presidential election, the RealClearPolitics average of national polls on election day showed Obama and Romney essentially tied. Gallup and Rasmussen had Romney leading, others had Obama narrowly ahead. To be precise, the average of all polls had Obama up 0.7%. Obama won by 4% and by 332 electoral votes to 206.

In the week running up to polling day in the 2014 Scottish referendum, polls showed No to independence leads ranging from 1% (Panelbase and TNS BMRB) to 7% at the very top end (Survation), and Yes to independence leads of between 2% (YouGov) and 7% (ICM/Sunday Telegraph). The final polls had No to independence between 2% and 5% ahead. The actual result was No by 10.6%. The result had been reflected in the betting markets throughout, with No to independence always a short odds-on favourite. To give an example of the general bookmaker prices, one client of William Hill staked a total of £900,000 to win £193,000, which works out at an average price of about 1 to 5.
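To see what such a price implies, fractional odds can be turned into a probability. The sketch below assumes the usual convention that odds of "profit to stake" imply a probability of stake divided by (stake plus profit), and it ignores any bookmaker margin built into the price; the function name is made up for illustration.

```python
from fractions import Fraction

def implied_probability(profit, stake):
    """Convert fractional odds of 'profit to stake' into an implied probability.

    Assumes the simple conversion stake / (stake + profit), with no
    adjustment for the bookmaker's margin.
    """
    return Fraction(stake, stake + profit)

# The William Hill bet described above: £900,000 staked to win £193,000.
william_hill_bet = implied_probability(profit=193_000, stake=900_000)

# A price of exactly 1 to 5, for comparison.
one_to_five = implied_probability(profit=1, stake=5)

print(f"William Hill bet: {float(william_hill_bet):.3f}")
print(f"1 to 5:           {float(one_to_five):.3f}")
```

On this reading the bet implies a chance of a No vote of a little over 82 per cent, close to the 83 per cent or so implied by a price of 1 to 5.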

In the 2015 Irish referendum on same-sex marriage, the final polls broke down as a vote share of 70% for Yes to 30% for No. In the spread betting markets, the vote share was being quoted with mid-points of 60% Yes  and 40% No. The final result was 62% Yes, 38% No, almost exactly in line with the betting markets.

In the Israeli election of 2015, the final polls showed Netanyahu’s Likud party trailing the main opposition party by 4% (Channel 2, Channel 10, Jerusalem Post), by 2% (Teleseker/Walla) and by 3% (Channel 1). Meanwhile, Israel’s Channel 2 television news on election day featured the betting odds on the online prediction market service, PredictWise. PredictWise had Netanyahu as 80% favourite. The next day, Netanyahu declared that he won “against the odds.” In fact, he did not. He won against the polls.

In the 2015 UK general election, the polling averages throughout the campaign had the Conservatives and Labour neck and neck, within a percentage point or so of each other. Meanwhile, the betting odds always had Tory most seats at very short odds-on. To compare at a point in time, three days before polling, the polling average had it tied. Simultaneously, Conservatives most seats was trading on the markets as short as 1 to 6.

If this anecdotal evidence is correct, it is natural to ask why the betting markets outperform the opinion polls in terms of forecasting accuracy. One obvious reason is that there is an asymmetry. People who bet in significant sums on an election outcome will usually have access to the polling evidence, while opinion polls do not take account of information contained in the betting odds (though the opinions expressed might, if voters are influenced by the betting odds). Sophisticated political bettors also take account of how good different pollsters are, what tends to happen to those who are undecided when they actually vote, differential turnout of voters, what might drive the agenda between the dates of the polling surveys and election day itself, and so on. All of this can in principle be captured in the markets.

Pollsters, except perhaps with their final polls (and sometimes even then) tend to claim that they are not producing a forecast, but a snapshot of opinion. This is the classic ‘snapshot defence’ wheeled out by the pollsters when things go badly wrong. In contrast, the betting markets are generating odds about the final result, so can’t use this questionable defence. In any case, polls are used by those trading the markets to improve their forecasts, so they are (or should be) a valuable input. But they are only one input. Those betting in the markets have access to much other information as well including, for example, informed political analysis, statistical modelling, focus groups and on-the-ground information including local canvass returns.

Does Big Data back up the anecdotal evidence? To test the reliability of the anecdotal evidence pointing to the superior forecasting performance of the betting markets over the polls, we collected vast data sets for a paper published in the Journal of Forecasting (‘Forecasting Elections’, 2016, by Vaughan Williams and Reade) of every matched contract placed on two leading betting exchanges and from a dedicated prediction market for US elections, since 2000. This was collected over 900 days before the 2008 election alone, and to indicate the size, a single data set was made up of 411,858 observations from one exchange alone for that year. Data was derived notably from presidential elections at national and state level, Senate elections, House elections, and elections for Governor and Mayor. Democrat and Republican selection primaries were also included. Information was collected on the polling company, the length of time over which the poll was conducted, and the type of poll. The betting was compared over the entire period with the opinion polls published over that period, and also with expert opinion and a statistical model. In this paper, as well as in Vaughan Williams and Reade – ‘Polls and Probabilities: Prediction Markets and Opinion Polls’, we specifically assessed opinion polls, prediction and betting markets, expert opinion and statistical modelling over this vast data set of elections in order to determine which performed better in terms of forecasting outcomes. We considered accuracy, bias and precision over different time horizons before an election.

A very simple measure of accuracy is the percentage of correct forecasts, i.e. how often a forecast correctly predicts the election outcome. We also identified the precision of the forecasts, which relates to the spread of the forecasts. A related but distinctly different concept to accuracy is unbiasedness. An unbiased probability forecast is also, on average, equal to the probability that the candidate wins the election. Forecasts that are accurate can also be biased, provided the bias is in the correct direction. If polls are consistently upward biased for candidates that eventually win, then despite being biased they will be very accurate in predicting the outcome, whereas polls that are consistently downward biased for candidates that eventually win will be very inaccurate as well as biased.
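The three measures can be sketched in a few lines. This is not the method used in the papers cited above, only one simple way to make the definitions concrete: accuracy is taken as the share of forecasts on the right side of 50%, bias as the average of forecast minus outcome, and precision as the standard deviation of the forecast errors. The forecasts and outcomes are invented numbers.

```python
from statistics import mean, pstdev

def accuracy(forecasts, outcomes):
    """Share of probability forecasts that fall on the correct side of 50%."""
    hits = [(p > 0.5) == bool(won) for p, won in zip(forecasts, outcomes)]
    return sum(hits) / len(hits)

def bias(forecasts, outcomes):
    """Average forecast error; zero means unbiased on average."""
    return mean([p - won for p, won in zip(forecasts, outcomes)])

def error_spread(forecasts, outcomes):
    """Spread (population standard deviation) of the forecast errors.

    A smaller spread is read here as a more precise forecaster;
    this definition of precision is an assumption.
    """
    return pstdev([p - won for p, won in zip(forecasts, outcomes)])

# Invented data: win probabilities for four candidates, 1 = won, 0 = lost.
forecasts = [0.8, 0.7, 0.6, 0.3]
outcomes = [1, 1, 0, 0]

acc = accuracy(forecasts, outcomes)
b = bias(forecasts, outcomes)
spread = error_spread(forecasts, outcomes)
print(acc, round(b, 3), round(spread, 3))
```

In this made-up sample the forecaster calls three races out of four correctly while overstating the candidates' chances on average, which shows how accuracy and bias can come apart.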

We considered accuracy, precision and bias over different time horizons before an election. We found that the betting/prediction market forecasts provided the most accurate and precise forecasts and were similar in terms of bias to opinion polls. We found that betting/prediction market forecasts also tended to improve as the elections approached, while we found evidence of opinion polls tending to perform worse.

In summary, we concluded that betting and prediction markets provide the most accurate and precise forecasts. We noted that forecast horizon matters: whereas betting/prediction market forecasts tend to improve nearer an election, opinion polls tend to perform worse, while expert opinion performs consistently throughout, though not as well as betting markets. There was also a systematic small bias against favourites, so that the most likely outcome is actually usually a little more likely than suggested in the odds. Finally, if the polls and betting markets say different things, it is normally advisable to look to the betting markets.

So let’s turn again to why we might expect the betting markets to beat the polls. Most fundamentally, opinion polls, like all market research, provide a valuable source of information, but they are only one source of information, and some polls have historically been more accurate than others. Traders in the markets consider such things as what tends to happen to ‘undecideds’. Is there a late swing to incumbents or the ‘status quo’? What is the likely impact of late endorsements by the press or potential late announcements? Is there a late on-the-day ‘tabloid press effect’, especially on emotions, which influences undecideds and drives turnout towards the chosen editorial line? What is the likely turnout? What is the impact of differential turnout? Finally, sophisticated bettors take account of the relative accuracy of different polls and look behind the headline results to the detailed breakdown and the methodology used in the poll. Betting markets should aggregate all the available information and analysis.

Moreover, people who know the most, and are best able to process the information, will tend to bet the most, but people who know only a little tend to bet only a little. The more money involved, or the greater the incentives, the more efficient and accurate will the market tend to be. It really is in this sense a case of “follow the money”.

Sometimes it is even possible to follow the money all the way to the future. To capture tomorrow’s news today. A classic example is the Intrade exchange market, ‘Will Saddam Hussein be captured or neutralised by the end of the month?’ Early on 13 December, 2003, the market moved from 20 (per cent chance) to 100. The capture was announced early on 14 December, 2003, and officially took place at 20:30 hours Iraqi time, several hours after the Intrade market moved to 100. I call these, with due deference to Star Trek, ‘Warp speed markets’.

But we need to be cautious. With rare exceptions, betting markets don’t tell us what the future will be. They tell us at best what the probable future will be. They are, in general, not a crystal ball. And we need to be very aware of this. Even so, the overwhelming consensus of evidence prior to the 2015 UK General Election pointed to the success of political betting markets in predicting the outcome of elections.

And then the tide turned.

The 2016 EU referendum in the UK (Brexit), the 2016 US presidential election (Trump) and the 2017 UK General Election (No overall majority) produced results that were a shock to the great majority of pollsters as well as to the betting markets. The turning of the tide could be traced, however, to the Conservative overall majority in 2015, which came as a shock to the markets and pollsters alike. After broadly 150 years of unparalleled success for the betting markets, questions were being asked. The polls were equally unsuccessful, as were most expert analysts and statistical models.

The Meltdown could be summarised in two short words. Brexit and Trump. Both broadly unforeseen by the pollsters, pundits, political scientists or prediction markets. But two big events in need of a big explanation. So where did it all go wrong?  There are various theories to explain why the markets broke down in these recent big votes.

Theory 1: The simple laws of probability. An 80% favourite can be expected to lose one time in five, if the odds are correct. In the long run, according to this explanation, things should balance out. It’s like there are five parallel universes. The UK in four of the parallel universes votes to Remain in the EU, but not in the fifth. Hillary Clinton wins in four of the parallel universes but not in the fifth. In other words, it’s just chance, no more strange than a racehorse starting at 4/1 winning the race. But for that to be a convincing explanation, it would need to assume that the 2015 election, Brexit, Trump and the 2017 election were totally correlated. Even if there is some correlation of outcome, the markets were aware of each of the predictive failures in the previous votes and still favoured the losing outcome by a factor of 4 or 5 to 1. That means we can multiply the probabilities. 1/5 × 1/5 × 1/5 × 1/5 = 1/625. 1/6 × 1/6 × 1/6 × 1/6 = 1/1296. Either way, it’s starting to look unlikely.
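The multiplication can be checked directly. The sketch below assumes, as the argument does, that the four votes are independent and that each upset had the same chance: 1 in 5 for a 4 to 1 outsider, or 1 in 6 for a 5 to 1 outsider. The function name is made up for illustration.

```python
from fractions import Fraction

def chance_of_all_upsets(upset_probability, number_of_votes):
    """Chance that the outsider wins every one of several independent votes."""
    return upset_probability ** number_of_votes

# Four votes: 2015 election, Brexit, Trump, 2017 election.
at_four_to_one = chance_of_all_upsets(Fraction(1, 5), 4)
at_five_to_one = chance_of_all_upsets(Fraction(1, 6), 4)

print(at_four_to_one)   # 1/625
print(at_five_to_one)   # 1/1296
```

If the outcomes were partly correlated, the combined chance would be higher than these figures, so they are best read as a lower bound on how likely the run of surprises was under pure chance.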

Theory 2: A second theory to explain recent surprise results is that something fundamental has changed in the way that information contained in political betting markets is perceived and processed. One interpretation is that the hitherto widespread success of the betting markets in forecasting election outcomes, and the publicity that was given to this, turned them into an accepted measure of the state of a race, creating a perception which was difficult to shift in response to new information. This is a form of ‘anchoring’. To this extent, market prices led opinion rather than simply reflecting it. From this perspective, the prices in the markets became a yardstick of the true probabilities and thus somewhat inflexible in response to the weight of new information. This leads to the herding hypothesis. Because the prediction markets had by 2015 become so firmly entrenched in conventional wisdom as an accurate forecasting tool, people herded around the forecasts, propelling the implied probabilities of existing forecasts upwards. So a 55% probability of victory, for example, became transformed into something much higher. In consequence, a prediction market implied probability of 70%, say, might be properly adjusted to a true probability of, say, 55%. In principle, it is possible to de-bias (or de-herd) each prediction market probability into a more accurate adjusted probability. We also need to look at the idea of self-reinforcing feedback loops. City traders look to the betting exchanges and the fixed-odds and spread bookmakers’ odds for evidence of what is the true state of play in each race. That influences the futures markets, which in turn influences perceptions among bettors. A sort of prediction market loop, in which expectations become self-reinforcing. This is a form of ‘groupthink’ in which those trading the futures and prediction markets are taking the position they are simply because others are doing so.
This is further reinforced by the key arbitrating divide which more than anything acts as a distinguishing marker between Brexit supporters and Remain supporters, between Trump voters and Hillary Clinton voters – educational level. More than any other factor, it is the ‘University education’ marker that identifies the Remain voter, the Clinton voter. Also, the vast majority of City traders as well as betting exchange traders are University-educated, and tend to mix with similar people, which may have reinforced the perception that Trump and Brexit were losing tickets. Indeed, more than ever before, as the volume of information increases, and people’s ability to sort between and navigate and share these information sources increases, there is a growing disconnect between the information being seen and processed by different population silos. This is making it increasingly difficult for those inhabiting these different information universes to make any sense of what is driving the preferences of those in alternative information universes, and therefore to engage with them and form accurate expectations of their likely voting behaviour and likelihood of voting. The divide is increasingly linked to age and educational profile, reducing the diversity of opinion which is conventionally critical in driving the crowd wisdom aspect of prediction markets. It also helps explain the broad cluelessness of the political and political commentating classes in understanding and forecasting these event outcomes. Of course, the pollsters, pundits, political scientists and politicians were broadly speaking just as clueless. So why?

Theory 3: Conventional patterns of voting broke down in 2015 and subsequently, primarily due to unprecedented differential voter turnout patterns across key demographics, which were not correctly modelled in most of the polling and which were missed by political pundits, political scientists, politicians and those trading the betting markets. In particular, there was unprecedented turnout in favour of Brexit and Trump by demographics that usually voted in relatively low numbers, notably the more educationally disadvantaged sections of society. And this may be linked to a breakdown of the conventional political wisdom. This wisdom holds that campaigns don’t matter, that swings of support between parties are broadly similar across the country, that elections can only be won from the centre, and that the so-called ‘Overton window’ must be observed. This idea, conceived by political scientist Joseph Overton, is that for any political issue there’s a range of socially acceptable and broadly tolerated positions (the ‘Overton window’) that’s narrower than the range of possible positions. It’s an idea which in a Brexit/Trump age seems to have gone very much out of the window.

Theory 4: Manipulation. Robin Hanson and Ryan Oprea co-authored a paper titled ‘A Manipulator Can Aid Prediction Market Accuracy’, in a special issue of Economica in 2009 which I co-edited. Manipulation can actually improve prediction markets, they argue, for the simple reason that manipulation offers informed investors a proverbial ‘free lunch.’ In a stock market, a manipulator sells and buys based on reasons other than expectations and so offers other investors a greater than normal return. The more manipulation, therefore, the greater the expected profit from betting. For this reason, investors should soon move to take advantage of any price discrepancies thus created within and between markets, as well as to take advantage of any perceived mispricing relative to fundamentals. Thus the expected value of the trading is a loss for the manipulator and a profit for the investors who exploit the mispricing. Manipulation creates liquidity, which draws in informed investors and provides the incentive to acquire and process further information, which makes the market ever more efficient.

Theory 5: Fake News. There are other theories, which may be linked to the demographic turnout theory, including notably the impact of misinformation (fake news stories), of hacked campaign email accounts, and direct manipulation of social media accounts. In fact, we know when it all started to go wrong. That was 7th May, 2015, when the Conservatives won an unforeseen overall majority in the General Election. That result led to Brexit. That in turn arguably helped propel Trump to power. And it led to the shock 2017 UK election result. Common to all these unexpected outcomes is the existence of a post-truth misinformation age of ‘fake news’ and the potential to exploit our exposure to social media platforms by those with the money, power and motivation to do so. The weaponisation of fake news might explain the breakdown in the forecasting power of the betting markets and pollsters, commencing in 2015, as well as the breakdown of the traditional forecasting methodologies in predicting Brexit and Trump. This has in large part been driven by the power of fake news distribution and the targeting of such via social media platforms, to alter traditional demographic turnout patterns. This is by boosting turnout among certain demographics and suppressing it among others. The weaponisation of fake news by the tabloid press is of course nothing new but it has become increasingly virulent and sophisticated and its online presence amplifies its reach and influence. The weaponisation of fake news by the tabloid press can also help explain on-the-day shifts in turnout patterns.

What it does not explain is some very odd happenings in recent times. Besides Brexit and Trump, Leicester City became 5,000/1 winners of the English Premier League. The makers and cast of La La Land had accepted the Oscar for Best Picture before it was snatched away in front of billions to be handed to Moonlight. This only echoed the exact same thing happening to Miss Venezuela when her Miss Universe crown was snatched away after her ceremonial walk to be awarded to Miss Philippines. And did the Atlanta Falcons really lose the Super Bowl after building an unassailable lead? And did the BBC Sports Personality of the Year Award go to someone whose chance of winning was so small he didn’t even turn up to the ceremony, while the 1/10 favourite was beaten by a little-known motorcyclist and didn’t even make the podium? Which leads us to Theory 6.

Theory 6: We live in a simulation. In the words of a New Yorker columnist in February 2017: “Whether we are at the mercy of an omniscient adolescent prankster or suddenly the subjects of a more harrowing experiment than any we have been subject to before … we can now expect nothing remotely normal to take place for a long time to come. They’re fiddling with our knobs, and nobody knows the end.”

So maybe the aliens are in control in which case all bets are off. Or have we simply been buffeted as never before by media manipulation and fake news? Or is it something else? Whatever the truth, we seem to be at the cusp of a new age. We know not yet which way that will lead us. Hopefully, the choice is still in our hands.