1. Peculiar Prevalence
I recommend reading the paper for the full details. The quick summary is that they collected 3,627 p-values from published journal articles and found a statistical anomaly: an excess of p-values just below 0.05. (They fit a parametric distribution to the p-values and demonstrated a significant departure from it right below 0.05.)
There are some obvious explanations. First, selection bias: studies with p-values above 0.05 are less likely to be published. Second, the tweaking effect: if you run a study and get a p-value just above 0.05, you might be tempted to tweak the analysis a bit until the p-value lands just below 0.05. I am not suggesting malfeasance; this could be done quite subconsciously.
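The tweaking effect is easy to demonstrate with a toy simulation. In the sketch below, every study starts with a null (uniform) p-value, and studies landing just above 0.05 get a few extra analysis attempts. The tweak window, the number of attempts, and the multiplicative nudge are all made-up illustrative assumptions, not anything estimated from real data.

```python
import random

random.seed(2)

N = 100_000
just_below = 0  # p-values landing in [0.04, 0.05)
just_above = 0  # p-values landing in [0.05, 0.06)

for _ in range(N):
    p = random.random()  # honest null study: p ~ Uniform(0, 1)
    if 0.05 < p < 0.07:
        # "tweaking": re-analyze up to 3 times; each tweak nudges the
        # p-value by a random multiplicative factor (toy assumption)
        for _ in range(3):
            if p < 0.05:
                break
            p *= random.uniform(0.6, 1.0)
    if 0.04 <= p < 0.05:
        just_below += 1
    elif 0.05 <= p < 0.06:
        just_above += 1

# without tweaking, both bins would hold about 1,000 p-values each;
# tweaking shifts mass from just above 0.05 to just below it
print(just_below, just_above)
```

Even this mild model, in which nothing is fabricated, produces exactly the kind of spike just below 0.05 that the paper documents.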
You might say: well, I would have predicted this would happen anyway. Perhaps. But it is quite interesting to see a carefully done study document and quantify the effect.
E.J. Masicampo was kind enough to share the data with me. Here is a histogram of the p-values.
The jump just below 0.05 is quite noticeable.
2. Multiscale Madness
Now I want to raise a meta-statistical issue. Fitting a parametric model and looking for a big residual is certainly a reasonable approach. But can we do this nonparametrically?
The density of p-values is a mixture of the form

$$f(t) = (1-\pi)\, u(t) + \pi\, g(t)$$

where $\pi$ is the fraction of non-null effects, $u(t) = 1$ is the uniform density (the density of the p-value under the null) and $g$ is the density of p-values for the non-null effects. The unknowns are $\pi$ and $g$. It is not unrealistic to assume that $g$ is decreasing. This implies that $f$ is some decreasing density.
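To see what this mixture looks like, here is a small simulation. The specific choices (a non-null fraction of 0.3, with non-null p-values generated from a one-sided z-test with a mean shift of 2) are illustrative assumptions, not estimates from the data.

```python
import math
import random

random.seed(1)

PI = 0.3   # assumed fraction of non-null effects (illustrative)
MU = 2.0   # assumed mean shift under the alternative (illustrative)
N = 100_000

def one_sided_p(z):
    """One-sided z-test p-value: P(Z > z) for Z ~ N(0, 1)."""
    return 0.5 * math.erfc(z / math.sqrt(2))

pvals = []
for _ in range(N):
    if random.random() < PI:
        # non-null effect: the density g of these p-values is decreasing
        pvals.append(one_sided_p(random.gauss(MU, 1.0)))
    else:
        # null effect: the p-value is exactly Uniform(0, 1)
        pvals.append(random.random())

# the mixture density f is decreasing: mass piles up near 0
frac_low = sum(p < 0.1 for p in pvals) / N
frac_high = sum(p >= 0.9 for p in pvals) / N
print(frac_low, frac_high)
```

Note that nothing in this model can produce a bump just below 0.05: the density only decreases, which is why an increase is suspicious.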
So, statistically, we might want to look for suspicious non-decreasing parts of the density. A good tool for doing this is the distribution-free method developed by Dümbgen and Walther (2008). Briefly, the idea is this.
Consider data $Y_1, \ldots, Y_n$ from a density $f$. Let

$$Y_{(1)} \leq Y_{(2)} \leq \cdots \leq Y_{(n)}$$

be the order statistics. Now construct all the local order statistics

$$T_{jk} = \sqrt{\frac{3}{k-j-1}} \sum_{i=j+1}^{k-1} \left( \frac{2\,(Y_{(i)} - Y_{(j)})}{Y_{(k)} - Y_{(j)}} - 1 \right)$$

for $1 \leq j < k \leq n$ with $k - j \geq 2$. These local order statistics are combined into a multiscale test statistic. Then we find the set of intervals

$$\mathcal{D}^+(\alpha) = \Bigl\{ (Y_{(j)}, Y_{(k)}) :\ T_{jk} > c_{jk}(\alpha) \Bigr\}$$

which are the intervals where the density apparently increases. It is possible to explicitly calculate the critical values $c_{jk}(\alpha)$. One can then claim that every interval in $\mathcal{D}^+(\alpha)$ contains an increase of the density $f$. The probability that this claim is wrong is at most $\alpha$. This procedure is exact: it is finite sample and distribution free. (Of course, one can construct intervals of decrease as well, but for our problem $\mathcal{D}^+(\alpha)$ is the set of interest.)
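The ingredients above can be sketched in a few lines of Python. This is only a simplified illustration: the local statistic follows the normalized form above, but instead of the scale-dependent critical values $c_{jk}(\alpha)$ it calibrates a single global cutoff by Monte Carlo under the uniform null, which is cruder (and less powerful) than the actual multiscale calibration.

```python
import math
import random

def local_stat(y, j, k):
    """Local test statistic T_{jk} for sorted data y (0-indexed),
    positive when observations pile up toward the right end of the
    interval (y[j], y[k]), i.e. when the density appears to increase."""
    span = y[k] - y[j]
    s = sum(2.0 * (y[i] - y[j]) / span - 1.0 for i in range(j + 1, k))
    return math.sqrt(3.0 / (k - j - 1)) * s

def global_crit(n, alpha, reps=200, seed=0):
    """Monte Carlo (1 - alpha) quantile of max over j < k of T_{jk}
    for a Uniform(0, 1) sample of size n; a simplified stand-in for
    the scale-dependent critical values c_{jk}(alpha)."""
    rng = random.Random(seed)
    maxima = []
    for _ in range(reps):
        y = sorted(rng.random() for _ in range(n))
        maxima.append(max(local_stat(y, j, k)
                          for j in range(n - 2)
                          for k in range(j + 2, n)))
    maxima.sort()
    return maxima[math.ceil((1 - alpha) * reps) - 1]

def apparent_increases(y, crit):
    """Intervals (y[j], y[k]) flagged as containing an increase."""
    n = len(y)
    return [(y[j], y[k])
            for j in range(n - 2) for k in range(j + 2, n)
            if local_stat(y, j, k) > crit]
```

For evenly spaced points, local_stat is zero; points clustered at the right end of an interval push it positive. Calibrating the cutoff under the uniform mirrors the distribution-free character of the method, though the genuine procedure combines the scales more cleverly.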
The procedure is implemented in the R package modehunt, written by Kaspar Rufibach and Guenther Walther. When applied to the p-values (I am ignoring a small technical issue, namely, that the data are rounded) we get the following:
The horizontal lines show the intervals with apparent increases in the density. The result confirms the findings of Masicampo and Lalande: there seem to be suspicious increases just below 0.05.
To quote Arte Johnson: "Verrry interesting."
Dümbgen, L. and Walther, G. (2008). Multiscale inference about a density. The Annals of Statistics, 36, 1758–1785.
Masicampo, E.J. and Lalande, D. (2012). A peculiar prevalence of p values just below .05. The Quarterly Journal of Experimental Psychology. http://www.tandfonline.com/doi/abs/10.1080/17470218.2012.711335.