Since I am revising a paper (always an annoying task), this seems like a good time for a rant. In particular, I want to complain about the perfunctory example.
1. Perfunctory Examples
Imagine this: you submit a paper to a journal (or a conference). The reviews are positive. But at least one reviewer feels compelled to add: “please add a real example” or “please add more simulations.”
Now, if adding an example or another simulation would truly make the paper better, then fine. But it’s my experience that the “add a real example” reaction is more of a reflex action. The reviewer has not asked himself or herself: does this paper really need an example (or another simulation)?
The result of these requests for examples is this: our journals are full of papers with perfunctory examples and simulations that add nothing to the paper. Worse, researchers waste weeks of precious time satisfying the arbitrary whims of a reviewer.
Let me repeat: I do think examples and simulations can sometimes make a paper better. But often they do not, and adding them is just a waste of time. There seems to be a basic, unquestioned assumption that a paper with a “real example” is publishable while a paper without one is not.
2. The Role of Applications
Now I want to clarify that I do think applications play a critical role in statistics and machine learning. In my opinion, theory is interesting when it is motivated, even indirectly, by real applications. (Just as theoretical physics is ultimately based on trying to understand the world we live in.)
Kiri Wagstaff has a nice paper called “Machine Learning That Matters” at ICML. Her abstract begins: “Much of current machine learning (ML) research has lost its connection to problems of import to the larger world of science and society.” This is very interesting to me because many years ago (perhaps in the 1980s?) similar discussions took place in the field of statistics. I think there was a general consensus that statistics was putting too much emphasis on theory for its own sake and needed to get back to its roots.
The problem is this: while I do think that good theory is ultimately rooted in applications, the link between applications and theory can be subtle. Solving applied problems leads to the development of new methods. Then people start to ask questions about the behavior of these methods. This leads to new theory. But the chain linking theory to applications can be long, complex and very non-linear. Simply adding an example to a paper is not going to illuminate the connection.
To me, theory is interesting if it explains, creates or casts some light on methodology. Judging whether theory is interesting is subtle and, yes, also a matter of taste. But there is a temptation to reduce it to: is there an example in the paper?
How can we make sure that statistics and machine learning stay anchored in the real world without simply requiring that people add some bogus example to their papers? Kiri Wagstaff offers some ideas, including six impact challenges. I’d like to know what other people think about it.
One possible answer is: it doesn’t matter. People should work on, and publish, whatever they want. On average, good theory will get noticed and used; less useful theory will attract less attention. In other words, let the field find its own direction. We shouldn’t try to direct it. (This happens to be my opinion but I suspect many will disagree.)
In the meantime, let’s have a moratorium on reviewers asking for more examples and simulations, just for the sake of it.
And to borrow from Dennis Miller, the master of rants, I’ll conclude with:
That’s just my opinion, I could be wrong.