Yes, these are good points.

I’ll post more on this.

Thanks for the comments.

I think researchers want to learn something from the particular data set that they observed. What may happen for other, hypothetical sets of data that have not been observed is less relevant. In other words, it seems to me that inference needs to be conditional on the specific data you observed. Unconditional frequentist procedures may do well *on average*, but we do not observe average data — we observe a particular data set, and performance of unconditional procedures may be good on average but terrible for a particular data set. In recognition of this we might condition on recognizable subsets, ancillary statistics, etc., but problems remain (there are situations for which these do not exist). The Berger & Wolpert 1988 book on the Likelihood Principle lists many examples where frequentist inference is just plain silly (e.g., you can be 100% confident that a parameter lies in a 50% frequentist confidence interval). Smart frequentists can often patch things up, but only in an ad-hoc fashion.
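To make the "100% confident in a 50% interval" point concrete, here is a minimal Python sketch of a standard textbook illustration. It may not be the exact example Berger & Wolpert use, and the model, the value of `theta`, and the function name are my own assumptions. Two observations are drawn from a uniform distribution on (theta - 0.5, theta + 0.5), and the interval from the smaller to the larger observation is a valid 50% confidence interval. Yet whenever the two observations happen to lie more than 0.5 apart, the interval is certain to contain theta.

```python
import random


def simulate_uniform_interval(theta=3.0, n_sims=100_000, seed=1):
    """Coverage of [min(x1, x2), max(x1, x2)] for two Uniform(theta-0.5, theta+0.5) draws.

    Returns (overall coverage, coverage among data sets whose range exceeds 0.5).
    theta, n_sims and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    covered = 0        # intervals that contain theta
    wide = 0           # data sets with range > 0.5
    wide_covered = 0   # of those, intervals that contain theta
    for _ in range(n_sims):
        x1 = rng.uniform(theta - 0.5, theta + 0.5)
        x2 = rng.uniform(theta - 0.5, theta + 0.5)
        lo, hi = min(x1, x2), max(x1, x2)
        hit = lo <= theta <= hi
        covered += hit
        if hi - lo > 0.5:
            wide += 1
            wide_covered += hit
    return covered / n_sims, wide_covered / wide


overall, given_wide = simulate_uniform_interval()
print(f"coverage over all data sets:       {overall:.3f}")
print(f"coverage when the range is > 0.5:  {given_wide:.3f}")
```

Averaged over all data sets the procedure covers about half the time, as advertised, but for the recognizable subset of widely spaced observations the "50%" label is plainly wrong for the data in hand.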

So if I’m a shopkeeper, and I care about how I do on average, across a series of product tests, say, then _perhaps_ a frequentist procedure might be usefully applied. But researchers don’t want to use procedures because they do well on average, in repeated testing: they want to learn something about a very specific situation, about what their data set tells them about their hypotheses.

Also, I should stress that the Sellke et al. (2001) argument is not a Bayesian argument. They just show that the p-value is much less diagnostic than people think, because it neglects the fact that a particular p-value can also be rare under H1. Berger has an applet that makes the point, and it is a purely frequentist argument.
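A simulation in the spirit of that applet can be sketched in a few lines of Python. This is not Berger's code, and every setting here is an assumption of mine: half of the simulated experiments have H0 true, the other half have a true effect of 2.8 standard errors (roughly 80% power for a two-sided z-test at the .05 level), and I look only at experiments whose p-value lands just below .05. The only thing counted is a long-run frequency: among those "just significant" results, how often was H0 actually true?

```python
import random
from statistics import NormalDist


def fraction_null_near_p05(effect=2.8, n_sims=200_000, seed=2):
    """Among z-tests with p in (0.04, 0.05), return the fraction where H0 was true.

    Assumptions (illustrative only): H0 is true in half of the experiments,
    and under H1 the test statistic is Normal(effect, 1).
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    null_hits = 0
    alt_hits = 0
    for i in range(n_sims):
        h0_true = (i % 2 == 0)                    # alternate H0 / H1 experiments
        z = rng.gauss(0.0 if h0_true else effect, 1.0)
        p = 2.0 * (1.0 - std_normal.cdf(abs(z)))  # two-sided p-value
        if 0.04 < p < 0.05:
            if h0_true:
                null_hits += 1
            else:
                alt_hits += 1
    return null_hits / (null_hits + alt_hits)


frac_null = fraction_null_near_p05()
print(f"fraction of p ~ .05 results that came from a true H0: {frac_null:.3f}")
```

Under these assumed settings the fraction comes out far above 5%, because a p-value near .05 is not very likely under H1 either. Changing `effect` or the share of true nulls changes the number, which is why the settings should be read as illustrative.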

Cheers,

E.J.

Yes, they say they want that, but they don’t understand that it means giving up coverage.

Ask them to choose between Envelope A and Envelope B, so that they can actually see there is a difference and that they have to make a choice.

I work with some astronomers who often use Bayes, but when they run a simulation and see that their intervals don’t cover 95 percent of the time, they think there is something wrong. They say they want Bayes, until they really see that they don’t get coverage.

Give them a clear, stark choice between a Bayesian interval with low coverage and a confidence interval with correct coverage, and see what happens.
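The kind of simulation I have in mind can be sketched as follows. It is a toy setup, not the astronomers' actual analysis, and all the numbers are assumptions: normal data with known standard deviation 1, five observations, a Normal(0, 1) prior on the mean, and a true mean of 3 that sits in the tail of that prior. The sketch compares how often the 95% credible interval and the usual 95% confidence interval contain the fixed true mean.

```python
import math
import random
from statistics import NormalDist


def coverage_at_fixed_theta(theta=3.0, n=5, sigma=1.0, tau=1.0,
                            n_sims=50_000, seed=3):
    """Frequentist coverage of two 95% intervals for a normal mean at a fixed theta.

    Returns (credible interval coverage, confidence interval coverage).
    Prior is Normal(0, tau^2); all parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.975)
    post_prec = 1.0 / tau**2 + n / sigma**2   # posterior precision (conjugate normal model)
    post_sd = math.sqrt(1.0 / post_prec)
    se = sigma / math.sqrt(n)
    bayes_hits = 0
    freq_hits = 0
    for _ in range(n_sims):
        xbar = sum(rng.gauss(theta, sigma) for _ in range(n)) / n
        post_mean = (n * xbar / sigma**2) / post_prec   # shrinks xbar toward the prior mean 0
        bayes_hits += abs(post_mean - theta) <= z * post_sd
        freq_hits += abs(xbar - theta) <= z * se
    return bayes_hits / n_sims, freq_hits / n_sims


bayes_cov, freq_cov = coverage_at_fixed_theta()
print(f"credible interval coverage at theta = 3:   {bayes_cov:.3f}")
print(f"confidence interval coverage at theta = 3: {freq_cov:.3f}")
```

With these assumed settings the confidence interval covers about 95% of the time, while the credible interval falls well short at this particular true value. Averaged over true means drawn from the prior itself, the credible interval would cover 95%, so the shortfall is a statement about fixed-parameter coverage, which is exactly the choice between the two envelopes.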

I think it is the opposite: most researchers would actually prefer a Bayesian interval (usually called a “credible interval”), which gives them a 95% probability that the true value lies in that interval, and many researchers therefore misinterpret the frequentist confidence interval as if it gave them that.
