Nate Silver is a Frequentist: Review of “the signal and the noise”

Nate Silver Is A Frequentist
Review of “the signal and the noise” by Nate Silver

There are not very many self-made statisticians, let alone self-made statisticians who become famous and get hired by the New York Times. Nate Silver is a fascinating person. And his book, The Signal and the Noise, is a must-read for anyone interested in statistics.

The book is about prediction. Silver chronicles successes and failures in the art of prediction and he does so with clear prose and a knack for good storytelling.

Along the way, we learn about his unusual life path. He began as an economic consultant for KPMG. But his real passion was predicting player performance in baseball. He developed PECOTA, a statistical baseball analysis system which earned him a reputation as a crack forecaster. He quit his day job and made a living playing online poker. Then he turned to political forecasting, first at the Daily Kos and later at his own website, FiveThirtyEight.com. His accurate predictions drew media attention and in 2010 he became a blogger and writer for the New York Times.

The book catalogues notable successes and failures in prediction. The first topic is the failure of ratings agencies to predict the bursting of the housing bubble. Actually, the bursting of the bubble was predicted, as Silver points out. The problem was that Moody’s and Standard and Poor’s either ignored or downplayed the predictions. He attributes the failure to their having too much confidence in their models and not allowing for outliers. Basically, he claims, they mistook good “in-sample prediction error” for good “out-of-sample prediction error.”
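The gap between in-sample and out-of-sample error is easy to reproduce. The following is a minimal sketch, not anything taken from Silver’s book: it assumes a toy data-generating process (y equals x plus Gaussian noise) and a hypothetical 1-nearest-neighbor predictor that simply memorizes its training data. Such a model looks perfect on the data it was fit to and clearly worse on fresh data.

```python
import random

def nearest_neighbor_predict(train, x):
    """Predict y at x by copying the y of the closest training point."""
    closest_x, closest_y = min(train, key=lambda pair: abs(pair[0] - x))
    return closest_y

def mean_squared_error(train, eval_data):
    """Average squared error of the memorizing predictor on eval_data."""
    errors = [(nearest_neighbor_predict(train, x) - y) ** 2 for x, y in eval_data]
    return sum(errors) / len(errors)

def simulate(n, rng):
    """Toy assumption: y = x + standard Gaussian noise, x uniform on [0, 10]."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, x + rng.gauss(0, 1)) for x in xs]

rng = random.Random(0)
train = simulate(200, rng)
test = simulate(200, rng)

in_sample = mean_squared_error(train, train)      # each point is its own nearest neighbor
out_of_sample = mean_squared_error(train, test)   # fresh noise is not memorized

print(f"in-sample MSE:     {in_sample:.3f}")
print(f"out-of-sample MSE: {out_of_sample:.3f}")
```

The in-sample error is zero by construction, which says nothing about how the model behaves on data it has not seen.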

Next comes a welcome criticism of bogus predictions from loud-mouthed pundits on news shows. Then, a fun chapter on how he used relatively simple statistical techniques to become a crackerjack baseball predictor. This is a theme that Silver touches on several times. If you can find a field that doesn’t rely on statistical techniques, you can become a star just by using some simple, common sense methods. He attributes his success at online poker, not to his own acumen, but to the plethora of statistical dolts who were playing online poker at the time.

He describes weather forecasting as a great success, detailing the incremental, painstaking improvements that have taken place over many years.

One of the striking facts about the book is the emphasis that Silver places on frequency calibration. (I’ll have more to say on this shortly.) He draws a plot of observed frequency versus forecast probability for the National Weather Service. The plot is nearly a straight line. In other words, of the days that the Weather Service said there was a 60 percent chance of raining, it rained 60 percent of the time.
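A calibration plot of this kind is just a table of observed frequency grouped by forecast probability. Here is a minimal sketch on simulated data. It assumes a hypothetical forecaster who is perfectly calibrated and who only issues forecasts in steps of 10 percent; none of the numbers come from the National Weather Service.

```python
import random
from collections import defaultdict

def calibration_table(forecasts, outcomes):
    """Map each distinct forecast probability to the observed frequency of rain."""
    counts = defaultdict(lambda: [0, 0])  # forecast -> [rainy days, total days]
    for p, rained in zip(forecasts, outcomes):
        counts[p][0] += rained
        counts[p][1] += 1
    return {p: rainy / total for p, (rainy, total) in sorted(counts.items())}

rng = random.Random(1)
levels = [i / 10 for i in range(11)]  # forecasts of 0%, 10%, ..., 100%
forecasts = [rng.choice(levels) for _ in range(50_000)]

# Assumption: the forecaster is perfectly calibrated, so it rains with
# exactly the stated probability on each simulated day.
outcomes = [1 if rng.random() < p else 0 for p in forecasts]

table = calibration_table(forecasts, outcomes)
for p, freq in table.items():
    print(f"forecast {p:.0%}: rained {freq:.1%} of the time")
```

Plotting the table’s values against its keys gives the calibration curve; for a well-calibrated forecaster the points fall close to the diagonal.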

Interestingly, the calibration plot for the Weather Channel shows a bias at the lower frequencies. Apparently, this is intentional. The loss function for the Weather Channel is different from the loss function for the National Weather Service. The latter wants accurate (calibrated) forecasts. The Weather Channel wants accuracy too, but they also want to avoid making people annoyed. It is in their best interests to over-predict rain slightly for obvious reasons: if they predict rain and it turns out to be sunny, no big deal. But if they predict sunshine and it rains, people get mad.
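To see how an asymmetric loss function pushes a forecast upward, here is a small sketch. It assumes a weighted squared-error loss in which under-forecasting a rainy day costs `w_miss` times the squared error and over-forecasting a dry day costs `w_false_alarm` times the squared error. These weights and function names are hypothetical illustrations, not the Weather Channel’s actual procedure. With equal weights the best report is the true probability; with a heavier penalty on misses the best report is higher.

```python
def expected_loss(report, p_rain, w_miss, w_false_alarm):
    """Expected weighted squared loss of reporting `report` when rain has probability `p_rain`."""
    loss_if_rain = w_miss * (1 - report) ** 2       # penalty for under-forecasting a rainy day
    loss_if_dry = w_false_alarm * report ** 2       # penalty for over-forecasting a dry day
    return p_rain * loss_if_rain + (1 - p_rain) * loss_if_dry

def optimal_report(p_rain, w_miss, w_false_alarm):
    """Closed-form minimizer of expected_loss, found by setting its derivative in `report` to zero."""
    return p_rain * w_miss / (p_rain * w_miss + (1 - p_rain) * w_false_alarm)

for p in (0.05, 0.2, 0.5, 0.8):
    honest = optimal_report(p, w_miss=1, w_false_alarm=1)
    inflated = optimal_report(p, w_miss=2, w_false_alarm=1)
    print(f"true {p:.2f} -> symmetric loss {honest:.2f}, asymmetric loss {inflated:.2f}")
```

Under these assumptions the inflation is largest in relative terms at low probabilities, which is at least consistent with a bias that shows up at the lower end of the calibration plot.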

Next come earthquake predictions and economic predictions. He rates both as duds. He goes on to discuss epidemics, chess, gambling, the stock market, terrorism, and climatology. When discussing the accuracy of climatology forecasts he is way too forgiving (a bit of political bias?). More importantly, he ignores the fact that developing good climate policy inevitably involves economic prediction, to which he already gave a failing grade. (Is it better to spend a trillion dollars helping Micronesia develop a stronger economy so they don’t rely so much on farming close to the shore, or to spend the money on reducing carbon output and hence delay rising sea levels by two years? Climate policy is inextricably tied to economics.)

Every chapter has interesting nuggets. I especially liked the chapter on computer chess. I knew that Deep Blue beat Garry Kasparov but beyond that, I didn’t know much. The book gives lots of juicy details.

As you can see, I liked the book very much and I highly recommend it.

But …

I have one complaint. Silver is a big fan of Bayesian inference, which is fine. Unfortunately, he falls into that category I referred to a few posts ago. He confuses “Bayesian inference” with “using Bayes’ theorem.” His description of frequentist inference is terrible. He seems to equate frequentist inference with Fisherian significance testing, mostly using Normal distributions. Either he learned statistics from a bad book or he hangs out with statisticians with a significant anti-frequentist bias.

Have no doubt about it: Nate Silver is a frequentist. For example, he says:

“One of the most important tests of a forecast — I would argue that it is the single most important one — is called calibration. Out of all the times you said there was a 40 percent chance of rain, how often did rain actually occur? If over the long run, it really did rain about 40 percent of the time, that means your forecasts were well calibrated.”

It does not get much more frequentist than that. And if using Bayes’ theorem helps you achieve long run frequency calibration, great. If it didn’t, I have no doubt he would have used something else. But his goal is clearly to have good long run frequency behavior.

This theme continues throughout the book. Here is another quote from Chapter 6:

“A 90 percent prediction interval, for instance, is supposed to cover 90 percent of the possible real-world outcomes, … If the economists’ forecasts were as accurate as they claimed, we’d expect the actual value for GDP to fall within their prediction interval nine times out of ten …”

That’s the definition of frequentist coverage. In Chapter 10 he does some data analysis on poker. He uses regression analysis with some data-splitting. No Bayesian stuff here.
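Frequentist coverage can be checked directly by simulation. The sketch below is a toy example, not Silver’s analysis: it assumes Normal data with known standard deviation 1 and an unknown mean `mu`, builds the textbook 90 percent prediction interval for the next observation, and counts how often the interval traps that observation over many repeated experiments. The function names are made up for illustration.

```python
import random
from statistics import NormalDist

def prediction_interval(sample, level=0.90, sigma=1.0):
    """Interval for one future draw from N(mu, sigma^2), with mu unknown and sigma known."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # The future draw minus the sample mean has variance sigma^2 * (1 + 1/n).
    half_width = z * sigma * (1 + 1 / len(sample)) ** 0.5
    center = sum(sample) / len(sample)
    return center - half_width, center + half_width

def coverage(mu, n=10, trials=20_000, seed=0):
    """Fraction of repeated experiments in which the interval contains the next observation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, 1.0) for _ in range(n)]
        low, high = prediction_interval(sample)
        future = rng.gauss(mu, 1.0)
        hits += low <= future <= high
    return hits / trials

for mu in (-3.0, 0.0, 50.0):
    print(f"mu = {mu:6.1f}: coverage = {coverage(mu):.3f}")
```

The point of the exercise is that the long run hit rate is close to 90 percent whatever the true `mu` is, with no prior involved.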

I don’t know if any statisticians proof-read this book but if they did, it’s too bad they didn’t clarify for Silver what Bayesian inference and frequentist inference really are.

But perhaps I am belaboring this point too much. This is meant to be a popular book, after all, and if it helps to make statistics seem cool and important, then it will have served an important function.

So try not to be as pedantic as me when reading the book. Just enjoy it. I used to tell people at parties that I am an oil-fire fighter. Now I’ll say: “I’m a statistician. You know. Like that guy Nate Silver.” And perhaps people won’t walk away.

This entry was written by normaldeviate, posted on December 4, 2012 at 7:53 pm, filed under Uncategorized. Both comments and trackbacks are currently closed.

47 Comments

Andrew Gelman

Posted December 4, 2012 at 8:33 pm | Permalink

Larry:

There is such a thing as Bayesian calibration of probability forecasts. If you are predicting a binary outcome y.new using a Bayesian prediction p.hat (where p.hat is the posterior expectation E(y.new|y)), then Bayesian calibration requires that E(y.new|p.hat) = p.hat for any p.hat. This isn’t the whole story (as always, calibration matters but so does precision), but it’s not the same as frequentist calibration or unbiasedness. In frequentist calibration, the expectation is taken conditional on the value of the unknown parameters theta in the model. The calibration you describe above (for another example, see here and scroll down) is unconditional on theta, thus Bayesian. So I disagree with you that those calibrations are frequentist and not Bayesian. But of course I completely agree with you that the concept of frequency performance of methods is important. It’s just that Bayesian calibration does not condition on theta.
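One way to see the distinction between averaging over theta and conditioning on theta is a small simulation. This is only a sketch under stated assumptions: a uniform prior on theta, a handful of Bernoulli observations, and p.hat taken as the posterior mean (y + 1)/(n + 2). The function name and settings are hypothetical.

```python
import random
from collections import defaultdict

def calibration_by_forecast(draw_theta, n=4, trials=100_000, seed=0):
    """Return {y: (p_hat, observed frequency of y_new = 1)} over repeated simulated experiments.

    p_hat is the posterior mean of theta under a uniform prior after y successes in n trials.
    """
    rng = random.Random(seed)
    tally = defaultdict(lambda: [0, 0])  # y -> [times y_new was 1, times this y occurred]
    for _ in range(trials):
        theta = draw_theta(rng)
        y = sum(rng.random() < theta for _ in range(n))
        y_new = rng.random() < theta
        tally[y][0] += y_new
        tally[y][1] += 1
    return {y: ((y + 1) / (n + 2), ones / total) for y, (ones, total) in sorted(tally.items())}

# Averaging over theta drawn from the prior: the observed frequency tracks p_hat.
averaged = calibration_by_forecast(lambda rng: rng.random())
# Holding theta fixed at 0.9: the observed frequency stays near 0.9 whatever p_hat says.
conditional = calibration_by_forecast(lambda rng: 0.9)

for y, (p_hat, freq) in averaged.items():
    print(f"averaged over theta: p_hat = {p_hat:.3f}, observed = {freq:.3f}")
for y, (p_hat, freq) in conditional.items():
    print(f"theta fixed at 0.9:  p_hat = {p_hat:.3f}, observed = {freq:.3f}")
```

In the first run the forecasts are calibrated in the sense E(y.new|p.hat) = p.hat; in the second, with theta held fixed, the same forecasts are not.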
- normaldeviate
  
  Posted December 4, 2012 at 8:53 pm | Permalink
  
  Hi Andrew
  
  Thanks for the comment.
  
  I’m not sure I agree that he is conditioning in the way you say.
  (But maybe you were making a general statement and not referring to him?)
  
  But anyway, my bigger point is that he confuses “Bayesian inference” with
  “applying Bayes’ theorem.”
  And his description of frequentist inference is way out there.
  
  Larry
murbard

Posted December 4, 2012 at 9:22 pm | Permalink

In what world is caring about good calibration not Bayesian? As a good Bayesian, you know your models are fallible, and you know they may even fail predictably. Thus, you may want to have a prior about how biased your model is likely to be, observe the cross-entropy of your predictions and reality, and update your estimate of the bias. Basically, Bayesianism doesn’t preclude doing model output statistics.
- normaldeviate
  
  Posted December 4, 2012 at 9:30 pm | Permalink
  
  That’s fine.
  I certainly have no objection to doing that.
  But now, what is the difference between Bayesian and Frequentist inference?
  How would you define them?
  Is a Bayesian who tries to match long run frequency behavior, a Bayesian or a frequentist?
  - Arthur B.
    
    Posted December 5, 2012 at 9:42 am | Permalink
    
    I personally (and others may disagree) define the difference between Bayesian and frequentist by how they think about the ontological nature of probability.
    
    To the frequentist, probabilities *are* the long run frequencies. A probability estimate is correct in so far as it approaches the true long run frequency of a repeated event. To the Bayesian, probabilities represent beliefs about an event. The Bayesian log-probability of E can be thought of as the information gain upon observing E.
    
    In practice, many hypotheses we’re interested in have no long run frequencies (e.g. does smoking cause lung cancer). The frequentist has no concept of a probability for such an event and thus relies on statistical tests to validate or reject the hypothesis. The Bayesian, in contrast, can assign a probability to the event.
    
    All models are wrong, but it can be a good idea to introduce a latent slack variable in the model to correct some simple forms of bias. The Bayesian will thus care about the long run frequency of its predictions only insofar as it is evidence used to update the belief about the latent slack variable, not as a form of ontological truth it’s trying to approximate.
  - normaldeviate
    
    Posted December 5, 2012 at 10:30 am | Permalink
    
    I agree with you Arthur.
    And that’s why I am saying Nate is a frequentist
    even if he claims otherwise.
    
    Larry
A. Winkler

Posted December 4, 2012 at 9:55 pm | Permalink

Nice review! I’m halfway through the book and thinking that maybe Silver originally wanted just successful predictions and didn’t bother much about being formally correct or having his strategies all fall under a certain logic (i.e., let’s say, some academicism). My take thus far is that the stronger Bayesian emphasis may have come somewhat “a posteriori”, perhaps as a framework needed to put things into place for a cogent book.

It’s a good book, I have no doubt, and this includes the footnotes, which are inconveniently hidden at the far end, not at the foot of the pages (at least in the UK edition). They are too interesting to be missed.
Andrew Gelman

Posted December 4, 2012 at 10:04 pm | Permalink

Larry:

In response to your response to me: I haven’t read Nate’s book so I’m not sure what he does. But I expect his calibration is Bayesian. Just about any purely data-based calibration will be Bayesian, as we never know theta. Thus, I disagree with your statement that, by doing calibration, Nate is being frequentist. I agree that he’s looking at frequency properties, but frequentist calibration as usually defined in textbooks is conditional on theta. If you allow the term “frequentist” to include calibrations that average over theta, then I’d describe Nate’s calibrations (as I understand them from your description) to be both frequentist and Bayesian; they’re Bayesian frequency calculations. I’d be happy with that.

In response to your response to Murbard: I agree with everything that Murbard says above. In answer to your question, “But now, what is the difference between Bayesian and Frequentist inference?”, the answer is that the frequentist (as I understood it) conditions on theta, while the Bayesian averages over theta. That can be a big difference in settings where theta is high dimensional or where there is otherwise great uncertainty in theta. Frequentists are aware of this problem and have ways of handling it (for example, by averaging over some but not all components of theta), but it is a concern.
- normaldeviate
  
  Posted December 5, 2012 at 8:14 am | Permalink
  
  I respectfully disagree Andrew.
  Perhaps we’ll just have to agree to disagree.
  I discussed this at length here
  
  WHAT IS BAYESIAN/FREQUENTIST INFERENCE?
  
  so I won’t re-hash the arguments.
  
  Larry
  - Carlos Cinelli
    
    Posted December 5, 2012 at 9:03 am | Permalink
    
    Larry, if someone asks:
    
    Is data available today good evidence that Obama will win the election? How good?
    
    Most “frequentist” practitioners, I mean, scientists who use frequentist methods, would say: “the results of the polls show that Obama’s electoral vote is (or is not) significantly different from 270”. And with that result in hand, they would try to say whether or not there is good evidence that Obama will win.
    
    On the other hand, a Bayesian practitioner, like Nate Silver, would say: “yes, the data today suggest that Obama is very likely to win, with 91% probability” or “no, the data suggest that Obama has only a 20% chance of winning”. (And some “frequentists” would argue (as they did) that this answer is nonsense, for the election has not happened yet and “elections” do not have “distributions”.)
    
    Do you agree that these would be the standard answers?
    
    That does not seem to me a difference in the goal, but indeed a difference in the method to answer the very same question.
  - normaldeviate
    
    Posted December 5, 2012 at 9:09 am | Permalink
    
    No I don’t.
    What made Nate predict the election well was that he combined polls
    using standard statistical methods, rather than using
    a single poll. A good frequentist analysis would have confidence
    sets for the number of electoral college votes, which would probably
    look the same as Nate’s Bayesian intervals.
  - Andrew Gelman
    
    Posted December 5, 2012 at 9:05 am | Permalink
    
    Larry:
    
    I don’t understand. What exactly do you disagree with? Are you saying that a frequentist calibration does not need to average over theta? If so, I can accept that; it’s just different from what I remember seeing in textbooks.
  - normaldeviate
    
    Posted December 5, 2012 at 9:15 am | Permalink
    
    What I mean is this.
    By reading his book, my impression is that
    he likes to use Bayes theorem but his statistical goals
    are all focused on long run frequencies.
    Bayes’ theorem is used by statisticians of all stripes.
    He explicitly criticizes frequentist statistics and then proceeds
    to elevate long run frequencies to a high status, which, in my opinion,
    is the definition of frequentist inference.
    There are plenty of sensible Bayesians (like you) who are happy to
    appeal to frequency properties. But then you wouldn’t go on to criticize those
    very frequentist things.
  - Keith O'Rourke
    
    Posted December 5, 2012 at 9:40 am | Permalink
    
    But Larry you did criticize Nate for
    “Either he learned statistics from a bad book or he hangs out with statisticians with a significant anti-frequentist bias.”
    
    Now, if overlooking this, “The average coverage (averaged with respect to the Bayesian’s prior) is 1-alpha.
    You can then think of the frequentist coverage as a sort of robust Bayesian idea; the average coverage is at least 1-alpha when averaged over any prior.”, defines a bad stats book, then almost all are bad?
    
    Or if seeming to pretend not to know it defines a “significant anti-frequentist (or Bayes) bias”, then my guess is that 80%+ of statisticians who did PhDs at one time are in that category (most are not in academia).
    
    I think we all sort of know (or think we know) how this type of ignorance came about. How do we fix it?
  - normaldeviate
    
    Posted December 5, 2012 at 10:33 am | Permalink
    
    I agree with your robustness interpretation.
    
    I don’t really want to fault him too much
    for his poor description of frequentist inference.
    He’s not an academic after all.
    But I am curious how he got to think of it this way.
  - Carlos Cinelli
    
    Posted December 5, 2012 at 11:05 am | Permalink
    
    Hi Larry,
    
    The problem I see in your definition is that you are attributing only to frequentists the focus on frequencies, and surely that is not true. Applied Bayesians, in science or in forecasting, as far as I know do care about frequencies; they just condition their frequencies in a different manner.
    
    In this sense, someone could make a Bayesian analysis, say, with quantified subjective prior information elicited from experts, and if the goal of the analysis is to have its predictions working accurately we would call it “frequentist” even though it is clearly different from “Fisherian” or “Neyman-Pearson” methods…
  - normaldeviate
    
    Posted December 5, 2012 at 11:29 am | Permalink
    
    I agree Carlos.
    It is not so black and white.
    That’s my point.
    Silver makes it seem as if it is black and white.
  - Keith O'Rourke
    
    Posted December 5, 2012 at 2:02 pm | Permalink
    
    Sorry Larry, I thought my quotation marks were enough to indicate the statements came from you in the posting you referenced.
    
    Mostly agree, but would not use the term “robustness” as there is no sense of trimming (excluding extreme parameter values from consideration) as in Stigler’s recent insightful paper on robustness.
    
    I would prefer a survey about misunderstandings, but my sense is that almost all learning energy is dissipated on getting the derivations, proofs, and now programming, and almost none is left over for trying to grasp how/why statistics enables “on average” less wrongness and/or misunderstanding of the world. Also students in mostly Bayesian departments quickly learn to be disparaging (but very vague) of Frequentist concepts and vice versa (even if the faculty want open discussion).
Rhiannon Weaver

Posted December 5, 2012 at 8:15 am | Permalink

Outside of the realm of statisticians, I found that understanding of “Bayes” falls into one of about three categories: Machine Learning classification enthusiasts (who generally have lots and lots of data, or simple models replicated everywhere), Fuzzy Logic folks arising from the debate between Dempster and Pearl on how to represent knowledge in true artificial “brains” where there is less of an idea of replication and “the next X”, and “For-profit risk analysts” who offer “BAYES!” as the solution for everything in business seminars, sometimes at the expense of setting up very simple classical alternative straw-man arguments and not always understanding the implications of subjective probability. The idea of long-run frequency properties of Bayesian methods is basically a “check your models” argument, right? And Jay showed me a short paper he’d written that shows that the only “calibration curve” that is truly Bayesian (for calibrating *priors*) is the identity curve. But that is maybe a different kind of calibration: the kind where you “calibrate” yourself by comparing your uncertainty measurements on a ton of completely unrelated facts against the answers and see if your percentages line up. That seems like what Andrew was saying would be “conditioning on theta”. I attended a workshop once where we did this. It seemed to me to be contrary to the notion that any subjective prior is valid. Conditioning on data would seem to me to be more of a case of sequential learning than that kind of calibration? Or some kind of confirmatory model fit or some such.

There was a flurry of interest a few months ago in “BBN”s where I work so I gave a talk on them and tried to show people that they were actually a rather restrictive sub-class of the general hierarchical model that can be represented by a DAG. Not sure how much that helped. For the folks who are like, “BBNs will solve my problem!” I tried to get across that it was like me as a software engineer saying “C++ will solve my problem!”
- hereismyhandle
  
  Posted April 8, 2013 at 1:57 pm | Permalink
  
  “I attended a workshop once where we did this.”
  
  Can you say which?
Entsophy

Posted December 5, 2012 at 8:21 am | Permalink

Let r_i=1 if it rained on the ith day and r_i=0 if it didn’t rain on the ith day. There is absolutely nothing preventing a Bayesian from considering P(r_1,…,r_n). A Bayesian will often find relationships between the marginal distributions P(r_i) and the frequency of rain f=\sum_i r_i/n which will be reminiscent of Frequentist calibration.

There is nothing anti-Bayesian about this. On the other hand, a Frequentist or a Bayesian who insists on calibration will require the P(r_i) to be approximately equal to f. This places a set of severe constraints on the general probability distribution P(r_1,…,r_n). Effectively you’ve tied both arms behind your back because you are only considering a tiny subset of possible P(r_1,…,r_n).

You can see that at some point this will become harmful, because as our state of knowledge K improves, P(r_1,…,r_n|K) will concentrate about the true sequence r_1,…,r_n of rain/non-rain days. In that case the marginal distributions P(r_i|K) will all get close to 0 or 1 and move away from f=\sum_i r_i/n. In other words, it is no longer possible for the probability of rain on the ith day to be approximately equal to the frequency of rain.

The Frequentist’s great sin isn’t that they’re wrong, it’s that they take an important special case [when P(r_i) =f ] and try to bend every application of probability into that special case regardless of how unnatural it is.
- Entsophy
  
  Posted December 5, 2012 at 8:45 am | Permalink
  
  I might add that Frequentists overcome this problem while retaining the idea that p=f, by thinking of r_1,…,r_n as one sequence of many such sequences, and then imagining that P(r_1,…,r_n) is equal to the frequency of such sequences. Most of the time this is pure fantasy. In some cases, it isn’t possible to even imagine this at all. When this happens Frequentists just deny that probability can be applied.
  
  The insanity of all this, is that it is completely unnecessary to indulge in such fantasies or placing any such artificial restrictions on probability theory. To predict the one sequence r_1,…,r_n that we actually care about all we need is a P that places the actual sequence in the high probability manifold P(r_1,…,r_n|K). It’s completely irrelevant whether you can imagine multiple repetitions of r_1,…,r_n or not!
John White

Posted December 5, 2012 at 9:45 am | Permalink

Have a look at

http://www.electoral-vote.com

and compare the stuff with Nate Silver’s work
bayesrules

Posted December 5, 2012 at 10:03 am | Permalink

The information that the Weather Channel fudges the low end of rain predictions is quite interesting. I’ve taught an elementary non-calculus course on decision theory to freshmen/sophomore honors college students a number of times. I describe the loss function for carrying/not carrying an umbrella as follows: If you carry the umbrella and it rains, zero loss. If you don’t carry it and it doesn’t rain, zero loss. If you don’t carry it and it rains, you get wet, big loss. If you carry it and it doesn’t rain, you look like a dork and you have to decide how bad that is compared to rain/no umbrella.

I had one student who said that she loves to get rained on. Her loss function is very different from mine. My umbrella is attached to my backpack, so I often look like a dork.
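The umbrella decision can be written out as an expected-loss comparison. This is a minimal sketch of the four-cell loss table described in this comment, with the two zero-loss cells left implicit; the numeric losses and function names are made-up illustrations, not values from the course.

```python
def expected_losses(p_rain, loss_wet, loss_dork):
    """Expected loss of each action; the two zero-loss cells contribute nothing."""
    carry = (1 - p_rain) * loss_dork   # carried it, no rain: you look like a dork
    leave = p_rain * loss_wet          # left it at home, rain: you get wet
    return {"carry": carry, "leave": leave}

def should_carry(p_rain, loss_wet, loss_dork):
    """Carry the umbrella only when doing so has strictly smaller expected loss."""
    losses = expected_losses(p_rain, loss_wet, loss_dork)
    return losses["carry"] < losses["leave"]

# Getting wet is nine times worse than looking like a dork: carry at a 20% chance of rain.
print(should_carry(0.2, loss_wet=9, loss_dork=1))   # True
# The student who loves getting rained on has no loss from rain: never carry.
print(should_carry(0.2, loss_wet=0, loss_dork=1))   # False
```

With positive losses this is the same as carrying the umbrella whenever the chance of rain exceeds loss_dork / (loss_dork + loss_wet), so the same forecast leads different people to different decisions.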
Mayo

Posted December 5, 2012 at 1:45 pm | Permalink

I haven’t read Silver, so I don’t know why (or if?) Larry implicitly seems to be granting his criticisms of significance tests: are Fisherian tests incapable of ensuring low erroneous rejections of nulls? Are they incapable of ensuring, say, that the relative frequency of erroneously inferring a genuine effect or genuine discrepancy from a null hypothesis is low?
One doubtless shouldn’t write comments in noisy public places with erratic internet, but it seems to me that this discussion (so far) is overlooking the huge difference between (deductively) predicting/assigning probabilities to the occurrence of events, and appraising hypotheses presumed correct/incorrect or approximately true/discrepant to some extent, etc. (as approximate descriptions of an aspect of what brought about the data). Gelman speaks of the Bayesian averaging over theta, and he is exactly right that this is a key difference. Would this be to appeal to possible worlds or to other hypotheses (about this world) similar (in some sense) to this one? Should we look at urns of null hypotheses (for a prior)? A frequentist could, in some special cases, assign a legitimate probability to a hypothesis, but the idea that one gets a frequentist probability of H being equal to p, because H is selected from a universe of hypotheses, p% of which are, or have been thought to be, true, is mistaken. In short: I haven’t read Silver’s book, but because he appears to be dealing with predicting the occurrence of events, it would appear irrelevant to the central issue of frequentist/Bayesian statistical inference. I quite agree with Larry that what Silver appears to be cheering as Bayesian is just relative frequency analysis.
- normaldeviate
  
  Posted December 5, 2012 at 1:54 pm | Permalink
  
  Yes. Your last sentence is the point I was trying to make
Corey

Posted December 5, 2012 at 11:08 pm | Permalink

“He describes weather forecasting as a great success detailing the incremental, pain-staking improvements that have taken place over many years.”

While it’s not unheard of for one to stake pain on the outcome of one’s work, surely it’s more common to take pains with one’s work?
- normaldeviate
  
  Posted December 6, 2012 at 8:37 am | Permalink
  
  oops
  good catch
xi'an

Posted December 6, 2012 at 1:41 am | Permalink

Would you mind publishing this review in CHANCE, by any chance???
- normaldeviate
  
  Posted December 6, 2012 at 8:37 am | Permalink
  
  Sure!
Tao Shi

Posted December 6, 2012 at 2:01 pm | Permalink

Reblogged this on Learning From Data and commented:
Found it interesting to see Larry and Andrew debate on if Nate Silver, a now world famous self-made statistician, is a frequentist or a bayesian.
- Tao Shi
  
  Posted December 6, 2012 at 2:48 pm | Permalink
  
  Instead of doing inference on Nate based on reading (or not reading) his book, how about asking Nate about it? Maybe he will be confused by the definition of “frequentist” or “bayesian”.
  - normaldeviate
    
    Posted December 6, 2012 at 2:49 pm | Permalink
    
    How?
    I don’t know him.
  - Tao Shi
    
    Posted December 6, 2012 at 2:51 pm | Permalink
    
    How about we contact NY Times or simply post on his blog.
  - normaldeviate
    
    Posted December 6, 2012 at 2:52 pm | Permalink
    
    go for it!
  - Tao Shi
    
    Posted December 6, 2012 at 2:53 pm | Permalink
    
    Will do! Should be a fun experiment. BTW, do we randomize on who is going to ask? 🙂
  - Arthur B.
    
    Posted December 6, 2012 at 2:54 pm | Permalink
    
    You can also probably email him at the address found at the end of this paper
    
    Click to access probdecisive2.pdf
  - Tao Shi
    
    Posted December 6, 2012 at 3:36 pm | Permalink
    
    Just twittered “@fivethirtyeight A few distinguished statisticians try to guess if you are a frequentist or Bayesian. https://normaldeviate.wordpress.com/2012/12/04/nate-silver-is-a-frequentist-review-of-the-signal-and-the-noise/ … Your thought?” Let’s see what happens.
  - normaldeviate
    
    Posted December 6, 2012 at 6:16 pm | Permalink
    
    Great
    Let’s see what he says
Mayo

Posted December 6, 2012 at 6:47 pm | Permalink

Given especially what Larry has written about him, I fail to see why one would expect to learn “what he is” by asking him. Of course, it would be good to correct terminological errors.
- Keith O'Rourke
  
  Posted December 7, 2012 at 11:00 am | Permalink
  
  Mayo: You are right, it’s too late; in fact, once he was done it was too late.
  
  At least Herbert Simon discovered you have to have people verbalize out loud while they are doing the work to discern what they _were_ thinking when they did it. Asking afterwards just fetches a socially desirable recreation fiction.
  
  My master’s thesis in Biostatistics was supposed to do this with statisticians doing power calculations for clinical trials. There were very few volunteers, and those who did volunteer withdrew as the above was explained to them.
  - Mayo
    
    Posted December 8, 2012 at 8:25 am | Permalink
    
    Keith: just curious: you were to ask them what power calculations they used? or about their philosophies?
Rush

Posted December 9, 2012 at 6:50 pm | Permalink

Larry said: “He explicitly criticizes frequentist statistics and then proceeds to elevate long run frequencies to a high status, which, in my opinion, is the definition of frequentist inference.”

That’s a paralogism, Larry. The frequencies in Nate’s book are real observed frequencies. Standard Frequentism relies on the frequency of things that didn’t happen. Also, if you’re really concerned about fundamentals, you can’t force things and just interpret confidence sets as probability sets. Frequentists such as Cox are well aware of that (as he discussed in the recent interview by Mayo). Given the results of some real poll, confidence sets will have probability one for some values of theta, and probability zero for other values of theta. This is the Achilles heel of Frequentism.
- Mayo
  
  Posted December 11, 2012 at 8:07 am | Permalink
  
  Probabilities derived from a probability model are of use to the extent that they are close to actual relative frequencies. If one is characterizing a method in terms of its reliability, precision, accuracy, margin of error and the like, one invariably considers its general capabilities, alluding to counterfactual/hypothetical measurements or expected outcomes. That is what it means to be talking about the capacity of a tool, be it a polling method, or a hypothesis, to predict reliably. No scientific claim of interest would result from just reporting what you observed happened (and even that has to be described in a certain way). No Achilles heel, but the nature of science.
apdawid

Posted December 11, 2012 at 6:00 pm | Permalink

It is not at all “unBayesian” to hope that there might/should be some relationship between your personal probabilities and the real world. Calibration and other “frequentist” criteria can address this issue. Even de Finetti distinguished between “good” and “bad” probability appraisers.

This topic has exercised me greatly over very many years, and I have written extensively on and around it: see the reading list at . In particular:

Dawid, A. P. (1982). The well-calibrated Bayesian (with Discussion). J. Amer. Statist. Ass. 77, 605-613.

Dawid, A. P. (1986). Probability Forecasting. Encyclopedia of Statistical Sciences vol. 7, edited by S. Kotz, N. L. Johnson and C. B. Read. Wiley-Interscience, 210-218.

Also (not in that list):

Dawid, A. P. (2004). Probability, causality and the empirical world: A Bayes-de Finetti-Popper-Borel synthesis. Statistical Science 19, 44-57.

Dawid, A. P. and Galavotti, M. C. (2009). de Finetti’s subjectivism, objective probability, and the empirical validation of probability assessments. In “Bruno de Finetti, Radical Probabilist” (M. C. Galavotti, Ed.). London: College Publications, 97-114.

Philip
- apdawid
  
  Posted December 11, 2012 at 6:02 pm | Permalink
  
  My reading list link didn’t come out, so trying again:
  
  Click to access preqpubs.pdf
  
  Philip
- normaldeviate
  
  Posted December 11, 2012 at 6:20 pm | Permalink
  
  I agree with that.
  But his book gives a distorted impression of what Bayesian and frequentist inference really are.
  
  Thanks for the references.

11 Trackbacks

By Lecture: j’entend Silver, le renard et le hérisson | Matières Vivantes on January 5, 2013 at 8:33 pm

[…] For example, on markets, he explains very well the herd effects in trading, which act as a very strong positive feedback. But in that case, one expects multistable systems with hysteresis effects (economists would perhaps speak of path dependence): in particular, this means that local optima can be stable, and hence that the market can fail to find the global optima. Worse, one can get dynamics analogous to those of glasses in physics. Silver, although he cites complex systems theory, completely ignores this kind of effect and in fact seems stuck on the idea that there is a single (somewhat Platonic) optimum in the market around which prices fluctuate (which seems impossible to me given the very effects he invokes). In fact, this is a general criticism that often comes up about Silver: he does not fully follow through on the principles he states, and it may even be that he is really more frequentist than Bayesian … […]
By 2012看过的几本书 | 六味斋 on January 18, 2013 at 9:44 pm

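[…] After the US election this book climbed steadily up the Amazon sales charts, because the author correctly predicted the presidential election result in every state. The book is about prediction with statistical models, and the author has a background in statistics. It covers all sorts of popular topics, such as finance, politics, economics, weather, earthquakes, epidemics, baseball, chess, Texas hold'em, stocks, and so on. The book has excited many statisticians, because it is said to be the first popular-science book on statistics to reach the general public and attract wide attention. Many statisticians tend to be cynical, feeling that many people do not regard statistics as a science, so they seem both insecure and resentful; this time someone has finally stood up for them. However, Silver's statistical training is only at the undergraduate level, so his expertise may indeed be limited. His popular exposition of Bayes' theorem is well written, but his attack on the frequentist school lacks supporting arguments and is unconvincing. Here is a statistician's blog: https://normaldeviate.wordpress.com/2012/12/04/nate-silver-is-a-frequentist-review-of-the-signal-and-… […]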
By Risks of Predictive Analytics - Data Community DC on February 28, 2013 at 9:03 am

[…] Consider providing the 20th percentile of a forecast distribution as the target. If your model is well-calibrated, those forecasts will be met 80% of the time. There is extensive psychological and business […]
By A bad graph » Source-Filter on May 17, 2013 at 7:35 pm

[…] illustrations scattered throughout). You can read a brief review of the content of the book at Normal Deviate (with lots of interesting discussion in the comments) as well as a discussion of two other reviews […]
By Why saying you are a Bayesian is a low information statement | Dynamic Ecology on June 19, 2013 at 8:25 am

[…] he appears to mean this in the historical Bayesian sense, iteratively improving and correcting. As Wasserman demonstrates, Silver’s definition of probability is about as frequentist and non-subjective as you can […]
By Book review: The Signal and the Noise by Nate Silver | Dynamic Ecology on June 25, 2013 at 4:23 pm

[…] thing I didn’t like is something Larry Wasserman and Brian also picked up on: Silver’s confusion about what it means to be […]
By Entsophy on August 5, 2013 at 3:57 pm

[…] illustrious Dr. Mayo recently reminded me of Larry Wasserman’s take on what is a Frequentist. In Larry’s […]
By Big Data – Ponderings of a Practitioner – I | EngKraft on November 8, 2013 at 9:16 pm

[…] Signal and the Noise: Why So Many Predictions Fail-but Some Don’t . (For those interested, a quick review from an eminent […]
By The Signal and the Noise – Nate Silver (2012) | Quartet on December 9, 2013 at 5:50 pm

[…] boom and then political forecasting on the internet. People who know a little bit about statistics have found some fault in the book, not least his damning view of something called frequentism (in contrast to a Bayesian philosophy) […]
By The Signal and the Noise – Nate Silver (2012) | Form Study on December 22, 2013 at 5:17 pm

[…] boom and then political forecasting on the internet. People who know a little bit about statistics have found some fault in the book, not least his damning view of something called frequentism (in contrast to a Bayesian philosophy) […]
By What counts as an upset? » Source-Filter on July 8, 2014 at 9:40 pm

[…] discussing the World Cup and their predictions based on what I can only assume is a fancy-pants Bayesian statistical model (done in Excel, […]
