Rise of the Machines

The Committee of Presidents of Statistical Societies (COPSS) is celebrating its 50th Anniversary. They have decided to publish a collection and I was honored to be invited to contribute. The theme of the book is Past, Present and Future of Statistical Science.

My paper, entitled Rise of the Machines, can be found here.

To whet your appetite, here is the beginning of the paper.

RISE OF THE MACHINES
Larry Wasserman

On the 50th anniversary of the Committee of Presidents of Statistical Societies I reflect on the rise of the field of Machine Learning and what it means for Statistics. Machine Learning offers a plethora of new research areas, new application areas and new colleagues to work with. Our students now compete with Machine Learning students for jobs. I am optimistic that visionary Statistics departments will embrace this emerging field; those that ignore or eschew Machine Learning do so at their own risk and may find themselves in the rubble of an outdated, antiquated field.

1. Introduction

Statistics is the science of learning from data. Machine Learning (ML) is the science of learning from data. These fields are identical in intent although they differ in their history, conventions, emphasis and culture.

There is no denying the success and importance of the field of Statistics for science and, more generally, for society. I’m proud to be a part of the field. The focus of this essay is on one challenge (and opportunity) to our field: the rise of Machine Learning.

During my twenty-five year career I have seen Machine Learning evolve from a collection of rather primitive (yet clever) methods for classification into a sophisticated science that is rich in theory and applications.

A quick glance at The Journal of Machine Learning Research (mlr.csail.mit.edu) and NIPS (books.nips.cc) reveals papers on a variety of topics that will be familiar to Statisticians, such as: conditional likelihood, sequential design, reproducing kernel Hilbert spaces, clustering, bioinformatics, minimax theory, sparse regression, estimating large covariance matrices, model selection, density estimation, graphical models, wavelets, and nonparametric regression. These could just as well be papers in our flagship statistics journals.

This sampling of topics should make it clear that researchers in Machine Learning — who were at one time somewhat unaware of mainstream statistical methods and theory — are now not only aware of, but actively engaged in, cutting edge research on these topics.

On the other hand, there are statistical topics that are active areas of research in Machine Learning but are virtually ignored in Statistics. To avoid becoming irrelevant, we Statisticians need to (i) stay current on research areas in ML, (ii) change our outdated model for disseminating knowledge, and (iii) revamp our graduate programs.

The rest of the paper can be found here.

This entry was written by normaldeviate, posted on February 16, 2013 at 5:13 pm, filed under Uncategorized.

18 Comments

Ronert Obst

Posted February 17, 2013 at 4:17 am

What is the proper citation information for this paper? I would like to quote from it in a seminar about future directions of statistics.
- normaldeviate
  
  Posted February 17, 2013 at 8:45 am
  
  good question
  I guess, for now, it is simply: To appear
David Rohde

Posted February 17, 2013 at 7:20 am

A very interesting paper!

My perspective, looking in the other direction from machine learning to stats:

Things I like about machine learning…. more room for creativity, interest in philosophy, interest in building working statistical systems and new or unconventional problems.

Things I like about statistics…. statistics seems to have nailed the language for expressing statistical problems… this is a big one in my opinion… machine learning sometimes reinvents inferior wheels here imo.

Things I don’t like about the machine learning conference culture…. lack of recognition for your conference paper work by other disciplines… also ability to participate is constrained by geography and funding…. both have been real problems for me.

Something that concerns me is that statistics is viewed by applied scientists as a preachy discipline that will point to things they are doing wrong.. I am not sure why this is the case, but I think it is unfortunate and unhelpful… an advantage of being a machine learning person is that scientists rightly or wrongly don’t expect you to lecture them.

I agree with the overall conclusion that the fields are complementary and am happy that machine learning is getting more recognition… and the backlash from the neural net hype days is gone.
- normaldeviate
  
  Posted February 17, 2013 at 8:48 am
  
  The preachy thing is true.
  After all, statisticians are sometimes the
  policemen on a grant who are there to
  say, “sorry, your results are not significant.”
  In a sense, statisticians are professional skeptics.
  - fred
    
    Posted February 22, 2013 at 8:50 pm
    
    Agree that statisticians shouldn’t just be there to say “sorry, p is bigger than alpha, so get lost”. Also agree that (as per Fisher) it’s better to help design good experiments, when possible, than to diagnose and preach about what terrible ones died of.
    
    But there’s a place for informed skepticism in science, right? (Nullius in verba, after all). When a statistician (or anyone else) reviews a grant/paper and reasonably says “sorry, your results are nowhere near precise enough to justify your conclusions” or “sorry, your results are hopelessly sensitive to a strong assumption for which there is no justification” they’re making a useful contribution to science – even when the grant/paper authors disagree.
joel

Posted February 17, 2013 at 1:10 pm

I would like to know what percentage of papers in ML conferences use ‘empirical process theory and concentration of measure’ in a meaningful way to derive novel insights? For example, just using a Hoeffding/Bernstein inequality where a limit theorem could be used should not be counted – sure, it gives a non-asymptotic statement, but the fundamental phenomenon going on is more or less the same. I agree there are some good papers (and as you mention – a majority of these end up in journals), but the signal-to-noise ratio is too low.
- normaldeviate
  
  Posted February 17, 2013 at 1:57 pm
  
  I don’t know the answer to your question
  but I do think you could ask the same question about
  papers in statistics journals.
  - joel
    
    Posted February 17, 2013 at 2:05 pm
    
    Yeah, I agree with you.. which is why I was surprised to see you mention ML conferences..
    I would much rather have a system like arxiv + comments (this is a whole other issue in itself though) or a sophisticated version of such a system..
  - normaldeviate
    
    Posted February 17, 2013 at 3:48 pm
    
    good point
shenting

Posted February 18, 2013 at 3:53 am

A (small) typo (I think?): in the Acknowledgements section, the first Tibshirani is mis-spelt.
- shenting
  
  Posted February 18, 2013 at 4:21 am
  
  A friend of mine points out another in section 1.7: “The goal of a graduate student in Statistics is to find and advisor”
  
  We both really enjoyed the paper though!
- normaldeviate
  
  Posted February 18, 2013 at 9:01 am
  
  Thanks
  - Emre
    
    Posted February 18, 2013 at 4:07 pm
    
    Thanks for the really good article.
    There is another typo in the second-to-last paragraph: “Potserior distributions”.
  - normaldeviate
    
    Posted February 18, 2013 at 4:09 pm
    
    Thanks
jim

Posted February 21, 2013 at 3:27 am

If you would be able to say much about robotics, think about how easy it is for a statistician to write something about quantum mechanics or computing. From your point of view ML and statistics may be similar but in reality they are not, not at all. If it’s so easy to switch fields, that means a more profound science supports each. This science is mathematical statistics. There, you will find proofs of theorems as much as in ML correctness of certain algorithms is proven or studied. If you refer in your comparison to the statistics of machine learning, then that would be just the statistics relevant to machine learning and not the statistics relevant to for example quantum mechanics, which pretty much proves my point.
Yisong Yue

Posted February 23, 2013 at 7:08 pm

Machine Learning is typically viewed as part of Computer Science, which is more often than not considered an engineering discipline. I think this has a strong influence on what’s considered “exciting” research, e.g., figuring out what to do with new types of data, making models scalable, etc.
Mark N.

Posted March 1, 2013 at 11:43 pm

I don’t know if it’s a productive comparison, but I couldn’t help mentally reading this alongside Leo Breiman’s article “Statistical modeling: The two cultures” (Statistical Science 16(3), 2001), which to some extent discusses the same ML/statistics disciplinary split. You mention very different things, though. I wonder to what extent you think any of his views are still accurate (if you think they were at the time, for that matter)? His main focus seems to be on the ML community’s greater interest in predictive vs. descriptive statistics, and their greater willingness to use models that are difficult to analyze with rigorous asymptotics (Breiman’s own CART perhaps being the canonical “early” example of a nonparametric model with good performance but unsatisfying theory). One area where you do agree is ML’s more computationalist culture, where the algorithm is the key unit of research, rather than an implementation detail.
Chao

Posted March 22, 2013 at 5:58 am

ML students are competing for jobs with stats students, I think, at least in the finance industry, because in that industry (for historical reasons) there are already many “quants” (or “financial engineers”) who are typically physicists, mathematicians, and computer scientists and can use C++/Java etc. to develop software (so they are sometimes called ‘software engineers’). We know that ML is usually viewed as a branch of computer science, so it is no surprise that the same bunch of people want to use their existing skills in a ‘big data’ era. The difference is that previously “quants” did not develop models based on data (in the same sense that Newton’s laws of motion were not developed based on data). Now the availability of data seems to be improving and they want to make use of the data, so they just “reinvent the wheel”. So this may be just the ‘aftereffect’ of traditional quants. Statistics can be promising if applied statisticians can seize this opportunity to make an impact in the industry – you have to wonder, since economics and finance are two disciplines that are so close to each other, why the long-established applied statisticians in economics (i.e. econometricians) did not take the ‘quants’ role but left the opportunities to physicists.
