An overview of the Multiple Comparison problem
Multiple Comparison Problem
In 2012, an Ig Nobel Prize was awarded to an fMRI study of a dead salmon: after testing many voxels without correcting for multiple comparisons, the authors found apparently significant activity in the dead salmon's brain.
This study is an example of what is known as the Multiple Comparison problem, defined in Wikipedia as “the problem that occurs when one considers a set of statistical inferences simultaneously or infers a subset of parameters selected based on the observed values”. In other words, the problem arises when a large number of statistical tests are run in the same experiment: the more tests we run, the higher the probability that at least one of them reaches statistical significance by chance alone.
In the dead salmon study, the authors examined brain activity across the 130,000 voxels of a typical fMRI volume. With so many tests, obtaining at least one false positive was almost certain, and that is exactly what happened.
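To make "almost certain" concrete, here is a minimal sketch of the probability of at least one false positive across m tests, 1 - (1 - α)^m. It assumes the tests are independent (neighbouring voxels in a real fMRI volume are not) and uses α = 0.05 purely as an illustrative threshold, not the one used in the salmon study. The function name is hypothetical.

```python
import math


def prob_at_least_one_false_positive(alpha: float, n_tests: int) -> float:
    """Return 1 - (1 - alpha) ** n_tests, assuming independent tests."""
    # log1p/expm1 keep the result accurate when alpha is tiny or n_tests is huge.
    return -math.expm1(n_tests * math.log1p(-alpha))


ALPHA = 0.05  # illustrative significance level (an assumption, not from the study)

for n_tests in (1, 10, 100, 130_000):
    p = prob_at_least_one_false_positive(ALPHA, n_tests)
    print(f"{n_tests:>7} tests -> P(at least one false positive) = {p:.4f}")
```

Under these assumptions the probability is already about 40% with 10 tests, and with 130,000 tests it is indistinguishable from 1.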
It is therefore important to be aware of this problem whenever multiple tests are run. To help data scientists avoid it, this article aims to:
- Show how to calculate the probability of obtaining at least one statistically significant difference between two groups as a function of α and the number of tests.
- Present multiple comparison corrections.
- Run an experiment and display the results using Python code.
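As a rough illustration of the three aims together, the sketch below simulates many comparisons between two groups drawn from the same distribution, so every significant result is a false positive, and then applies the Bonferroni correction (reject only when p ≤ α / m). To stay within the standard library it uses a two-sample z-test with known variance, which is a simplifying assumption; a t-test would be the more usual choice on real data. All function names, sample sizes, and the seed are hypothetical.

```python
import math
import random


def two_sample_z_test(a, b, sigma=1.0):
    """Two-sided p-value for a difference in means, assuming known, equal sigma."""
    n_a, n_b = len(a), len(b)
    diff = sum(a) / n_a - sum(b) / n_b
    z = diff / (sigma * math.sqrt(1 / n_a + 1 / n_b))
    # For a standard normal Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_null_p_values(n_tests, n_per_group=30, seed=0):
    """Run n_tests comparisons where both groups share the same distribution."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_tests):
        group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        p_values.append(two_sample_z_test(group_a, group_b))
    return p_values


def bonferroni_reject(p_values, alpha=0.05):
    """Bonferroni correction: reject only when p <= alpha / number of tests."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


ALPHA = 0.05
p_values = simulate_null_p_values(n_tests=1000)
uncorrected = sum(p <= ALPHA for p in p_values)
corrected = sum(bonferroni_reject(p_values, ALPHA))
print(f"False positives without correction: {uncorrected} of {len(p_values)}")
print(f"False positives with Bonferroni:    {corrected} of {len(p_values)}")
```

Because no real effect exists in the simulated data, roughly α × m uncorrected tests are expected to come out significant, while the Bonferroni threshold should leave few or none.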
Related links:
- Will Koehrsen: https://towardsdatascience.com/the-multiple-comparisons-problem-e5573e8b9578
- Overview: https://towardsdatascience.com/an-overview-of-methods-to-address-the-multiple-comparison-problem-310427b3ba92
Thank you for reading!