When Should We Expect to be Kicked by a Horse?
Exploring the Poisson Distribution
A version of this article appears in TWISTED LOGIC: Puzzles, Paradoxes, and Big Questions. By Leighton Vaughan Williams. Chapman & Hall/CRC Press. 2024.
A STATISTICAL TOOL
The Poisson distribution, named after the French mathematician Siméon Denis Poisson, is particularly useful for understanding events that occur infrequently. It gives the probability of observing any given number of such events in a fixed interval, provided we know the average rate at which they occur. In simpler terms, if you want to predict how often something infrequent will happen over a certain period, the Poisson distribution can be your go-to method.
This distribution finds practical applications in various fields, ranging from studying historical events to analysing everyday situations and even sports.
UNDERLYING ASSUMPTIONS OF THE POISSON DISTRIBUTION
The accuracy and applicability of the Poisson distribution hinge on several key assumptions:
Independence of Events: Each event must occur independently of the others. This means the occurrence of one event does not affect the probability of another event occurring.
Constant Average Rate: The events are expected to occur at a constant average rate. In other words, the average number of events per unit of time or space remains consistent throughout the period being considered.
Random Occurrence: The events occur randomly, without any predictable pattern or structure. This randomness is crucial for the Poisson model to provide accurate predictions.
Discrete Events: The events are distinct and countable. Examples include the number of emails received per day or the number of accidents at a particular intersection per month.
Understanding these assumptions is vital for correctly applying the Poisson distribution. The distribution is most effective where these conditions are met, such as modelling the number of meteors seen in an hour of clear night sky, counting the number of times a rare bird is spotted in a forest, or predicting the number of cars passing through a toll booth in an hour.
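One way to see why these assumptions lead to a Poisson count is to simulate them directly. The sketch below is illustrative only: the toll-booth rate of 12 cars per hour, the slot length, and the function name `simulate_hour` are assumptions, not figures from the article. It splits an hour into many short slots, lets a car arrive in each slot independently with the same small probability, and checks that the hourly counts have a mean and variance that are roughly equal, which is a hallmark of Poisson data.

```python
import random
import statistics

RATE_PER_HOUR = 12                 # assumed average cars per hour (hypothetical)
SLOTS = 720                        # one slot per five seconds
P_ARRIVAL = RATE_PER_HOUR / SLOTS  # same small chance in every slot (constant rate)

def simulate_hour(rng):
    """Count arrivals in one hour; each slot is independent of the others."""
    return sum(1 for _ in range(SLOTS) if rng.random() < P_ARRIVAL)

rng = random.Random(42)            # fixed seed so the run is repeatable
counts = [simulate_hour(rng) for _ in range(2000)]

mean_count = statistics.mean(counts)
var_count = statistics.variance(counts)
print(f"mean = {mean_count:.2f}, variance = {var_count:.2f}")
```

If the simulated variance came out much larger than the mean, that would hint that one of the assumptions (usually independence or a constant rate) does not hold.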
It’s also very useful in predicting how likely you are to be kicked by a horse next week! The next section explains.
PREDICTING RARE EVENTS: PRUSSIAN CAVALRY OFFICER DEATHS
Let’s travel back in time to the 19th century, when the Poisson distribution was applied to a grim historical question. Researchers wanted to understand the number of Prussian cavalry officers who were kicked to death by horses in different army corps over a span of 20 years. These deaths were relatively rare, but were they random, or were there underlying factors influencing them?
Enter Ladislaus Bortkiewicz, an economist and statistician. Bortkiewicz collected data from 14 corps over 20 years, giving 280 observations of the yearly number of deaths in a corps. Using the formula associated with the Poisson distribution, he calculated how many of those corps-years should be expected to see no deaths, one death, two deaths, and so on. The predictions fitted the observed data quite closely, which is consistent with the deaths being random events and nothing more mysterious or sinister.
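The comparison Bortkiewicz made can be sketched in a few lines. The observed counts below are the figures commonly quoted for the 14-corps, 20-year data; they do not appear in this article, so treat them as an assumption and check them against the original source before relying on them. The helper `poisson_pmf` is a name chosen here for illustration.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events when the average is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Commonly quoted tabulation (assumption, not taken from the article).
# Key = deaths in a corps-year, value = number of corps-years observed.
observed = {0: 144, 1: 91, 2: 32, 3: 11, 4: 2}

corps_years = sum(observed.values())                    # 14 corps * 20 years
total_deaths = sum(k * n for k, n in observed.items())
lam = total_deaths / corps_years                        # average deaths per corps-year

for k, n in observed.items():
    expected = corps_years * poisson_pmf(k, lam)
    print(f"{k} deaths: observed {n:3d}, Poisson expects {expected:6.1f}")
```

The only input the model needs is the average rate; every expected count follows from that single number.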
This application of the Poisson distribution became a textbook example of real-world events that can be modelled as a Poisson process. Other examples include radioactive decay, the arrival of emails, and the number of phone calls received by a call centre. The study of the Prussian cavalry deaths is also an early example of statistical work in the field of survival analysis.
WORLD WAR II BOMBING RAIDS
During the Second World War, a British statistician named R.D. Clarke used this method to study where the new V-1 ‘flying bombs’ were falling in London. He wanted to figure out if the German military was successfully targeting specific areas or if the bombs were falling randomly. This was strategically important information. It was clear that the V-1s sometimes fell in clusters. The question was whether this could be expected from random chance or whether precision guidance was at play.
To find out, Clarke divided an area of south London into 576 small, equal-sized squares. He assumed to start with that each square had the same small chance of being hit by any given bomb. This situation was similar to playing a game many times where you ‘win’ only infrequently. Clarke’s calculations showed that the number of squares with no hits, one hit, two hits, and so on closely matched what the Poisson distribution predicted for random hits. This meant that where the bombs fell appeared to be a product of chance rather than of targeting.
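A small simulation shows how clusters can appear with no targeting at all. The figures of 576 squares and 537 hits are the ones usually quoted for Clarke's study and are used here as assumptions; the bombs in the sketch are simply dropped on squares chosen uniformly at random.

```python
import math
import random
from collections import Counter

SQUARES = 576   # assumed number of equal-sized areas
BOMBS = 537     # assumed number of hits

rng = random.Random(1)  # fixed seed so the run is repeatable
# Each bomb lands in a square chosen uniformly at random: no targeting.
hits_per_square = Counter(rng.randrange(SQUARES) for _ in range(BOMBS))
# Tally how many squares received 1, 2, 3, ... hits, then add the untouched ones.
squares_by_hits = Counter(hits_per_square.values())
squares_by_hits[0] = SQUARES - len(hits_per_square)

lam = BOMBS / SQUARES   # average hits per square
for k in range(max(squares_by_hits) + 1):
    expected = SQUARES * math.exp(-lam) * lam**k / math.factorial(k)
    print(f"{k} hits: simulated {squares_by_hits[k]:3d}, Poisson expects {expected:6.1f}")
```

Even under pure chance, some squares take several hits while many take none, which is exactly the kind of clustering that can look deliberate to an observer on the ground.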
FROM HISTORY TO FOOTBALL: PREDICTING GOAL SCORING
In football, goals are a relatively infrequent event within the setting of a match, and so are suitable for the application of the Poisson distribution. This provides a simple and effective tool to examine and predict the likely incidence of goals in a match, based on historical data and average goal rates.
Consider, say, a match between two teams, one with an average goal rate of 1.6 goals per game and the other with an average goal rate of 1.2 goals per game. The Poisson distribution allows us to calculate the probabilities of various goal-scoring outcomes for this specific match.
For example, by examining the historical data and applying the Poisson distribution, analysts can estimate the probability of a goalless draw, a 1-1 draw, a win for either team, or any other scoreline based on the average goal rates of the teams involved.
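As a rough sketch of that calculation, the code below uses the two average goal rates from the example (1.6 and 1.2). It assumes, for simplicity, that each team's goals follow an independent Poisson distribution, which real matches only approximate. The names `poisson_pmf`, `score_prob`, and the cut-off `MAX_GOALS` are choices made here for illustration.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k goals when the average is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

RATE_A, RATE_B = 1.6, 1.2   # average goals per game for the two teams
MAX_GOALS = 10              # truncation point; higher scores are vanishingly rare

def score_prob(a, b):
    """Probability of the scoreline a-b, assuming independent scoring."""
    return poisson_pmf(a, RATE_A) * poisson_pmf(b, RATE_B)

goals = range(MAX_GOALS + 1)
p_a_wins = sum(score_prob(a, b) for a in goals for b in goals if a > b)
p_draw = sum(score_prob(a, a) for a in goals)
p_b_wins = sum(score_prob(a, b) for a in goals for b in goals if a < b)

print(f"0-0: {score_prob(0, 0):.3f}, 1-1: {score_prob(1, 1):.3f}")
print(f"A wins: {p_a_wins:.3f}, draw: {p_draw:.3f}, B wins: {p_b_wins:.3f}")
```

Summing the scoreline probabilities above, on, and below the diagonal gives the chances of a win for either side or a draw.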
More generally, the Poisson formula allows us to calculate the chance of observing a specific number of events of this kind when we know how often they usually occur on average. It considers the average rate and calculates the probability of obtaining the specific number we’re interested in.
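Written out, the formula described above takes a compact form. Here λ (lambda) stands for the average number of events per interval and k for the specific number we are interested in; the symbol X for the count is notation assumed here for convenience.

```latex
P(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}, \qquad k = 0, 1, 2, \dots
```

For instance, for a team averaging 1.6 goals per game, the chance of exactly two goals is 1.6² × e^(-1.6) / 2!, or roughly 0.26.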
REAL-WORLD APPLICATIONS
The practical applications of the Poisson distribution extend far beyond historical events and sports analytics. This versatile statistical concept finds relevance in a wide range of modern real-world scenarios, helping us understand and analyse various phenomena. Let’s explore some of its notable applications.
Homes Sold and Business Planning
Imagine you are a local estate agent. Understanding the number of homes you are likely to sell in a given time period is crucial for business planning and forecasting. The Poisson distribution provides a framework for estimating the probability of selling a specific number of homes per day, week, or any other timeframe based on historical data and average sales rates. This information helps in making informed decisions about marketing strategies, staffing, and resource allocation.
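Switching between timeframes is straightforward because the Poisson average scales with the length of the period. The sketch below assumes a purely hypothetical average of two homes sold per week and asks how likely at least one sale is over a day, a week, and four weeks; the function name `prob_at_least_one` is likewise an illustration.

```python
import math

WEEKLY_RATE = 2.0  # assumed average homes sold per week (hypothetical)

def prob_at_least_one(rate_per_week, weeks):
    """P(at least one sale) over a period; the mean scales with its length."""
    lam = rate_per_week * weeks
    return 1 - math.exp(-lam)   # 1 minus the chance of zero sales

p_day = prob_at_least_one(WEEKLY_RATE, 1 / 7)
p_week = prob_at_least_one(WEEKLY_RATE, 1)
p_month = prob_at_least_one(WEEKLY_RATE, 4)
print(f"day: {p_day:.3f}, week: {p_week:.3f}, four weeks: {p_month:.3f}")
```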
Disease Spread and Epidemiology
In the field of epidemiology, the Poisson distribution plays a vital role in understanding the spread of infectious diseases. By analysing historical data and considering the average rate of infection, researchers can utilise the Poisson distribution to estimate the likelihood of disease outbreaks and their progression.
Telecommunications and Network Traffic
The Poisson distribution finds application in the analysis of telecommunications systems and network traffic, where events such as incoming calls or data packets arrive at unpredictable moments. By modelling these arrival patterns with the Poisson distribution, companies can anticipate network demand, allocate resources effectively, and ensure smooth and reliable communication services.
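A simple capacity question illustrates the idea: how many arrivals per minute should a system be able to handle so that demand stays within capacity in 99% of minutes? The average of five calls per minute and the 99% target below are hypothetical, and `capacity_needed` is a name invented for this sketch.

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for a Poisson count with mean lam."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))

def capacity_needed(lam, coverage=0.99):
    """Smallest capacity c such that demand is at most c with the given probability."""
    c = 0
    while poisson_cdf(c, lam) < coverage:
        c += 1
    return c

CALLS_PER_MINUTE = 5.0  # assumed average arrival rate (hypothetical)
capacity = capacity_needed(CALLS_PER_MINUTE)
print(f"Provision for {capacity} calls per minute to cover 99% of minutes")
```

Note that the answer is well above the average rate: planning for the mean alone would leave the system overloaded much of the time.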
Quality Control and Manufacturing Processes
The Poisson distribution is also used in quality control, particularly in manufacturing settings. By analysing the number of defective products using the Poisson distribution, manufacturers can estimate the probability of observing a specific number of defects. This information helps them identify areas for improvement and enhance overall product quality.
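One common use is to ask whether a batch with an unusually high defect count is plausibly just bad luck. The sketch below assumes a hypothetical long-run average of three defects per batch and a batch in which nine were found; both numbers, and the function name `prob_at_least`, are illustrative only.

```python
import math

def prob_at_least(k, lam):
    """P(X >= k) for a Poisson count with mean lam."""
    return 1 - sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))

USUAL_DEFECTS_PER_BATCH = 3.0  # assumed long-run average (hypothetical)
observed_defects = 9           # hypothetical count in the batch under review

p_tail = prob_at_least(observed_defects, USUAL_DEFECTS_PER_BATCH)
print(f"Chance of {observed_defects} or more defects by luck alone: {p_tail:.4f}")
```

A very small tail probability suggests that something in the process has changed and is worth investigating, rather than being ordinary random variation.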
Traffic Accidents and Road Safety
Another area where the Poisson distribution finds application is in analysing traffic accidents and road safety. By examining historical data on accidents, researchers can use the Poisson distribution to model accident rates based on factors such as location, time of day, and road conditions. This understanding helps in the development of targeted interventions to reduce accidents and improve road safety.
CONCLUSION: A POWERFUL TOOL FOR INFREQUENT EVENTS
The Poisson distribution is a valuable statistical tool that helps us understand and analyse events that happen infrequently but at a known average rate. It may seem complicated at first, but it allows us to make predictions and informed decisions based on probabilities. By using the principles of the Poisson distribution, we can gain insights into rare events and use that knowledge to improve various aspects of our lives.
