When looking at many variables, it is easy to overlook how many possible correlations that are being tested. Multiple comparisons arise when a statistical analysis involves multiple simultaneous statistical tests, each of which has a potential to produce a “discovery.” For example, with a thousand variables, there are almost half a million (1,000×999/2) potential pairs of variables that might appear correlated by chance alone. While each pair is extremely unlikely in itself to show dependence, from the half a million pairs, it is very possible that a large number will appear to be dependent.

Say, for example, more than 20 comparisons are made where there is a 95% confidence level for each. In this case, you may well get a false comparison by chance.  This becomes a fallacy when that false comparison is seen as significant rather than a statistical probability. This fallacy can be addressed by the use of more sophisticated statistical tests.

A classic example of the multiple comparisons fallacy is the Birthday Paradox. In a group of 23 people (assuming each of their birthdays is an independently chosen day of the year with all days equally likely), there is in fact greater than a 50 per cent chance that at least two of the group share the same birthday. This seems counter-intuitive, since it is rare to meet someone that shares a birthday. Indeed, if you select two random people, the chance that they share a birthday is about 1 in 365. With 23 people, however, there are 253 (23×22/2) pairs of people who might have a common birthday. So by looking across the whole group, we are checking whether any one of these 253 pairings, each of which independently has a tiny chance of coinciding, does indeed match. Because there are so many possibilities of a pair , it makes it more likely than not, statistically, for coincidental matches to arise. For a group of as 40 people, say, it is nearly nine times as likely that at least share a birthday than that they do not.

Multiple Comparisons Fallacy. In: Paradoxes of Probability and other statistical strangeness. The Conversation. Woodcock, S. April 4, 2017. https://theconversation.com/paradoxes-of-probability-and-other-statistical-strangeness-74440

Multiple Comparisons Fallacy. Logically Fallacious. https://www.logicallyfallacious.com/tools/lp/Bo/LogicalFallacies/130/Multiple-Comparisons-Fallacy

The Multiple Comparisons Fallacy. Fallacy Files. http://www.fallacyfiles.org/multcomp.html

The Misleading Effect of Noise: The Misleading Comparisons Problem. Koehrsen, W. Feb. 7, 2018. whttps://towardsdatascience.com/the-multiple-comparisons-problem-e5573e8b9578

Birthday Problem. Wikipedia. https://en.wikipedia.org/wiki/Birthday_problem

From → Nutshells