Beautiful People Are Difficult – or are they?
Exploring Berkson’s Paradox
A version of this article appears in my book, Twisted Logic: Puzzles, Paradoxes, and Big Questions (Chapman & Hall/CRC Press).
BERKSON’S PARADOX
Berkson’s Paradox, also known as collider bias, is a statistical phenomenon in which unrelated factors appear to be linked because of the way a sample was selected. It’s a common pitfall in data analysis that can lead to misleading conclusions.
AN ILLUSTRATIVE EXAMPLE: COLLEGE ADMISSION
To better understand this statistical phenomenon, consider an example involving a prestigious college. This institution only accepts students who demonstrate excellence in either music or sports.
Now, in the wide pool of all students, there is no direct connection between musical talent and sports proficiency. However, the college’s unique selection criterion gives rise to a strange observation. Once we only consider the admitted students, it seems that those with exceptional musical talents don’t generally excel in sports, and similarly, those who are sports stars generally don’t show much musical ability.
This apparent negative correlation is simply an outcome of the way the college handpicks its students. The selection process inadvertently creates a group that displays a misleading negative association between musical and sporting abilities.
EXPLORING THE RELATIONSHIP
To understand what is going on, imagine that the college only admits students who score 100 in either sports or music, while their talent in the other subject is simply representative of the wider student average, which we take here to be 50. So among the admitted students, those admitted for music score 100 in music but average just 50 in sports, and vice versa. This results in the illusion of a negative correlation between sports and musical abilities, where no such relationship exists in the wider student population.
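The effect can be checked with a small simulation. The sketch below is only illustrative and rests on assumptions that are not in the example above: scores are drawn independently and uniformly between 0 and 100, and the admission rule is softened to "at least 90 in either subject". The names `pearson` and `simulate_admissions` are made up for this sketch.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def simulate_admissions(n_students=100_000, cutoff=90, seed=0):
    """Return (correlation among all students, correlation among admitted)."""
    rng = random.Random(seed)
    # Assumption: music and sports scores are independent, uniform on 0..100.
    students = [(rng.uniform(0, 100), rng.uniform(0, 100))
                for _ in range(n_students)]
    # Assumed admission rule: excel in music OR in sports.
    admitted = [(m, s) for m, s in students if m >= cutoff or s >= cutoff]
    music_all, sports_all = zip(*students)
    music_adm, sports_adm = zip(*admitted)
    return pearson(music_all, sports_all), pearson(music_adm, sports_adm)

r_all, r_admitted = simulate_admissions()
print(f"all students:  r = {r_all:+.2f}")
print(f"admitted only: r = {r_admitted:+.2f}")
```

Under these assumptions the correlation across all students should sit close to zero, while the correlation among admitted students comes out clearly negative, even though nothing links the two abilities when the scores are generated.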
BERKSON’S PARADOX IN EVERYDAY LIFE: THE BOOKS AND MOVIES CONUNDRUM
Berkson’s Paradox can creep into our daily lives, sometimes in unexpected ways. A common example revolves around the adaptation of books into movies. Have you ever noticed how often an excellent book seems to become a disappointing movie?
This observation could be a manifestation of Berkson’s Paradox. By focusing on the cases where good books are made into bad movies, we might miss the instances where good books are made into good movies, or bad books into bad movies, or even mediocre books into mediocre movies. These overlooked instances are crucial in determining the true nature of the correlation.
The key point is that there might be many successful adaptations of good books, but these are not as memorable or discussed as the disappointing ones. This leads to a perception bias. To accurately determine if there’s a negative correlation, we would need to systematically analyse a large and representative sample of book-to-movie adaptations, considering all combinations of book and movie quality. Otherwise, we may be inadvertently selecting a skewed sample. The selective memory here mirrors the selection bias of the college in our previous example.
THE ATTRACTIVENESS MISCONCEPTION: AN APPLICATION OF BERKSON’S PARADOX
A slightly tongue-in-cheek example of Berkson’s Paradox relates to attractiveness and demeanour. It’s not uncommon, so the narrative goes, to observe that attractive people tend to have an unpleasant attitude (‘handsome men are jerks’, in the words of statistician Jordan Ellenberg). Ellenberg suggests that this perception can arise if we tend to avoid or ignore individuals who are both unattractive and unpleasant. This leaves us mixing with plenty of attractive, unpleasant people and plenty of pleasant, unattractive people, but very few unpleasant, unattractive people. In our circles, therefore, we see a negative association between attractiveness and being pleasant even if no such association exists in the wider population.
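The same argument can be worked through with exact arithmetic. The sketch below assumes, purely for illustration, that half of all people are attractive, half are pleasant, and the two traits are independent; these proportions and the helper names (`pleasant_rate`, `everyone`, `our_circle`) are not from Ellenberg and are invented here.

```python
from fractions import Fraction
from itertools import product

# Assumed proportions in the wider population (illustrative only).
P_ATTRACTIVE = Fraction(1, 2)
P_PLEASANT = Fraction(1, 2)

def everyone(attractive, pleasant):
    """No selection: every kind of person is observed."""
    return True

def our_circle(attractive, pleasant):
    """We avoid people who are both unattractive and unpleasant."""
    return attractive or pleasant

def pleasant_rate(attractive, keep):
    """Share of pleasant people among those with the given attractiveness
    who pass the selection rule `keep`."""
    total = Fraction(0)
    pleasant_total = Fraction(0)
    for a, p in product([True, False], repeat=2):
        if a != attractive or not keep(a, p):
            continue
        # Independence: joint probability is the product of the marginals.
        weight = ((P_ATTRACTIVE if a else 1 - P_ATTRACTIVE)
                  * (P_PLEASANT if p else 1 - P_PLEASANT))
        total += weight
        if p:
            pleasant_total += weight
    return pleasant_total / total

for label, rule in [("whole population", everyone), ("our circle", our_circle)]:
    print(label,
          "| pleasant among attractive:", pleasant_rate(True, rule),
          "| pleasant among unattractive:", pleasant_rate(False, rule))
```

With these assumed numbers, half of attractive people are pleasant whether or not we filter. Among unattractive people the share is also one half in the whole population, but it rises to 1 in our circle, because the unpleasant ones have been screened out. Attractive people then look less pleasant by comparison, although the traits are independent by construction.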
BERKSON’S PARADOX: A SOURCE OF BIAS IN SAMPLE CONSTRUCTION
More generally, Berkson’s Paradox serves as a reminder of the potential pitfalls of sample construction. When we focus on certain aspects of interest and ignore others, we risk creating a biased sample. Such a sample may indicate a negative correlation where none exists in the whole population.
BERKSON’S PARADOX IN COVID-19 RESEARCH
A study published in 2020 suggested that smokers were less likely to be hospitalised due to COVID-19. Does this mean that smoking protects against COVID-19? A more likely explanation is that the study was an example of Berkson’s Paradox. Hospitalised patients were not a random sample of the population. They were disproportionately older individuals, frail individuals, smokers, and those with COVID-19.
To illustrate the problem, assume for simplicity that patients are admitted either for smoking-related illnesses or for COVID-19. COVID-19 tests in hospitals would then likely show lower infection rates among smokers, because many of them are already in hospital for smoking-related illnesses. Non-smokers, on the other hand, would be admitted only because they have COVID-19. This creates the appearance of a negative association between smoking and rates of COVID-19 infection even though no such negative association exists in the wider population.
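A rough simulation of this simplified hospital makes the distortion visible. Every number below is an assumption chosen for illustration, not an estimate from any study: 25% of people smoke, 10% have COVID-19 regardless of smoking status, and each condition leads to admission 20% of the time. The function name `covid_rates` is likewise hypothetical.

```python
import random

def covid_rates(n_people=200_000, p_smoker=0.25, p_covid=0.10,
                p_admit_smoking=0.20, p_admit_covid=0.20, seed=1):
    """COVID-19 rate by smoking status, in the population and in hospital."""
    rng = random.Random(seed)
    # tallies[group][is_smoker] = [number of people, number with COVID-19]
    tallies = {"population": {True: [0, 0], False: [0, 0]},
               "hospital": {True: [0, 0], False: [0, 0]}}
    for _ in range(n_people):
        smoker = rng.random() < p_smoker
        covid = rng.random() < p_covid  # drawn independently of smoking
        # Admission happens only via a smoking-related illness or COVID-19.
        admitted = ((smoker and rng.random() < p_admit_smoking)
                    or (covid and rng.random() < p_admit_covid))
        groups = ["population", "hospital"] if admitted else ["population"]
        for group in groups:
            tallies[group][smoker][0] += 1
            tallies[group][smoker][1] += covid
    return {group: {is_smoker: with_covid / total
                    for is_smoker, (total, with_covid) in by_smoker.items()}
            for group, by_smoker in tallies.items()}

rates = covid_rates()
for group in ("population", "hospital"):
    print(f"{group}: smokers {rates[group][True]:.1%}, "
          f"non-smokers {rates[group][False]:.1%}")
```

In the simulated population the infection rate is about the same for smokers and non-smokers. In the simulated hospital every non-smoker has COVID-19, since that is their only route to admission, while most hospitalised smokers do not. A researcher looking only at the ward would see smoking apparently protecting against infection.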
HOW TO AVOID THE BIAS
When analysing data, it’s crucial to consider the whole population and avoid focusing on a selective sample that could distort the real picture.
CONCLUSION: INTERPRETING DATA
Berkson’s Paradox teaches us the importance of comprehensive and unbiased data analysis. It reminds us that the way we select and interpret data can shape our understanding of the world. Understanding and recognising this paradox is crucial in various fields, including but not limited to social sciences, medical research, and data analysis.
