Nonparametric Regression, ABC and CNN

On Monday we had an interesting seminar by Samory Kpotufe on nonparametric regression. Samory presented a method for choosing the smoothing parameter, locally, in nonparametric regression. The method is simple and intuitive: construct confidence intervals using many different values of the smoothing parameter. Choose the value at which the confidence intervals stop intersecting. He has a theorem that shows that the estimator adapts to the local dimension and to the local smoothness. Very cool. The idea is similar in spirit to ideas that have been developed by Oleg Lepski and Alex Goldenshluger. I am looking forward to seeing Samory’s paper.
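
To make the intersecting-intervals idea concrete, here is a minimal sketch of a Lepski-type rule at a single point. It is not Samory's procedure: I am assuming a box-kernel local average, a known noise level `sigma`, and a fixed interval multiplier `z`, and the function name `local_bandwidth` is made up for illustration. The rule walks through the bandwidths from smallest to largest and keeps the last one whose confidence interval still overlaps all the earlier ones.

```python
import numpy as np

def local_bandwidth(x, y, x0, bandwidths, sigma, z=2.0):
    """Choose a bandwidth at x0 by intersecting confidence intervals.

    Illustrative assumptions: box-kernel local average, known noise
    standard deviation `sigma`, interval half-width z * sigma / sqrt(n_h).
    Returns (chosen bandwidth, local estimate), or (None, None) if no
    bandwidth captures any data.
    """
    lo, hi = -np.inf, np.inf          # running intersection of the intervals
    chosen, estimate = None, None
    for h in sorted(bandwidths):
        mask = np.abs(x - x0) <= h
        n_h = int(mask.sum())
        if n_h == 0:
            continue                  # no data in this window, try a larger one
        m_h = y[mask].mean()          # local average at bandwidth h
        half = z * sigma / np.sqrt(n_h)
        lo, hi = max(lo, m_h - half), min(hi, m_h + half)
        if lo > hi:
            break                     # intervals stopped intersecting
        chosen, estimate = h, m_h
    return chosen, estimate

# Small demo on simulated data (all settings are arbitrary).
rng = np.random.default_rng(0)
x_demo = rng.uniform(0.0, 1.0, size=2000)
y_demo = np.sin(8.0 * x_demo) + rng.normal(0.0, 0.3, size=2000)
grid = [0.01 * 2**k for k in range(7)]
print(local_bandwidth(x_demo, y_demo, 0.5, grid, sigma=0.3))
```

Small bandwidths give wide but nearly unbiased intervals; once the bandwidth is large enough for bias to dominate, the new interval drifts away from the earlier ones and the intersection becomes empty. That is the point where the sketch stops.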

On Wednesday we were fortunate to have another seminar by my old friend Christian Robert. Christian and I have not seen each other for a while so it was fun to have dinner and hang out. Christian spoke about ABC (approximate Bayesian computation). But really, ABC was just an excuse to talk about a very fundamental question: if you replace the data by a summary statistic (not sufficient), when will the Bayes factor be consistent? He presented sufficient conditions and then did some interesting examples. The conditions are rather technical but I don’t think this can be avoided. In our era of Big Data, this type of question will arise all the time (not just in approximate Bayesian computation) and so it was nice to see that Christian and his colleagues are working on this.
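
Here is a rough sketch of why the choice of summary matters for model choice. It is a toy setup of my own, not necessarily one of the examples from the talk: a normal model versus a Laplace model with the same variance, a normal prior on the location, equal prior model probabilities, and plain ABC rejection that keeps the simulations whose summary lands closest to the observed one. The names `abc_model_probability` and `mad` are hypothetical helpers.

```python
import numpy as np

def mad(y):
    """Median absolute deviation around the sample median."""
    return np.median(np.abs(y - np.median(y)))

def abc_model_probability(y_obs, summary, n_sims=5000, keep=100, seed=0):
    """ABC rejection estimate of P(model 0 | summary(y_obs)).

    Illustrative assumptions:
      model 0: y_i ~ Normal(mu, 1)
      model 1: y_i ~ Laplace(mu, 1/sqrt(2)), which also has variance 1
      prior:   mu ~ Normal(0, 2^2), both models equally likely a priori
    """
    rng = np.random.default_rng(seed)
    n = len(y_obs)
    s_obs = summary(y_obs)
    models = rng.integers(0, 2, size=n_sims)   # model index for each simulation
    mus = rng.normal(0.0, 2.0, size=n_sims)    # location drawn from the prior
    dist = np.empty(n_sims)
    for i in range(n_sims):
        if models[i] == 0:
            sim = rng.normal(mus[i], 1.0, size=n)
        else:
            sim = rng.laplace(mus[i], 1.0 / np.sqrt(2.0), size=n)
        dist[i] = abs(summary(sim) - s_obs)
    accepted = np.argsort(dist)[:keep]         # keep the closest simulations
    return float(np.mean(models[accepted] == 0))

# Data simulated from the normal model; compare two summaries.
y_obs = np.random.default_rng(1).normal(0.0, 1.0, size=100)
print("summary = mean:", abc_model_probability(y_obs, np.mean))
print("summary = MAD: ", abc_model_probability(y_obs, mad))
```

The sample mean has nearly the same distribution under both models, so the estimated model probability should hover around one half no matter how much data there is. The MAD has different limits under the two models, so it at least has a chance of discriminating between them.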

On a third, and unrelated note, I was watching CNN today. Someone named Roshini Raj (I believe she is a doctor at NYU) discussed a study from Harvard that showed that many foods, like pasta and red meat, are associated with depression. These reports drive me crazy. I have not looked at the study so I can’t comment on the study itself. But I had hoped that at least one anchor or the doctor would raise the obvious objection: this could be association without being causal. It did not occur to any of them to even raise this question. Instead, they immediately assumed it was causal and started talking about the importance of avoiding pasta and red meat. I don’t find pasta or red meat depressing but I do find this kind of bad reporting depressing. Again, it is possible that the paper itself is careful about this; I don’t know. But the reporting sucks.

This entry was written by normaldeviate, posted on October 31, 2013 at 5:54 pm, filed under Uncategorized.

6 Comments

steve h.

Posted November 1, 2013 at 4:35 am | Permalink

here is the CNN discussion: http://newday.blogs.cnn.com/2013/10/31/study-links-pasta-and-depression/
Andrew Gelman

Posted November 1, 2013 at 5:42 am | Permalink

Hey Larry. I did some googling. The author of the red-meat paper you mentioned also came out a couple years ago with a study suggesting that coffee reduces depression. On the other hand, we’ve heard from other quarters that coffee is a killer too!

To be more serious for a moment, I’m sure the authors of the paper in question addressed the causal issues to some extent, and I expect that in their article they are careful with the causal language. What’s frustrating is that there are a million news reports of this study on the web but I don’t see any link to the scientific paper. It should be standard practice when reporting a scientific study to link to the full paper. (I can’t find the paper on the author’s website either.)
- normaldeviate
  
  Posted November 1, 2013 at 8:33 am | Permalink
  
  Yes that drives me crazy. Why don’t they always put a link to the paper?!
  - Ben
    
    Posted November 1, 2013 at 9:11 am | Permalink
    
    Because only weirdo academics want a link to the study, of course! What fraction of CNN’s market is actually interested in following up in this kind of detail?
  - Keith O'Rourke
    
    Posted November 1, 2013 at 4:02 pm | Permalink
    
    On the other hand, I usually get the link, then have to read the paper, note reasonable attempts to address causal issues and use of careful language, but a complete failure to admit there is almost no credible information given that or that the suggested further search has prior probability well under 50% of going somewhere purposeful.
    
    For instance, the study is barely able to pick up known effects that are likely much larger, but then discerns one statistically significant lead out of four attempts (with no adjustment for multiplicity) – fortunately it is Friday 😉
WW

Posted November 1, 2013 at 6:23 pm | Permalink

ABC: What about other model selection tools such as a scoring rule based on the predictive power?

Response to CNN: http://www.youtube.com/watch?v=1j2Duy_xzEA

Normal Deviate