– by Larry Wasserman
What is the biggest open problem in statistics or machine learning?
This question has come up a number of times in discussions I have had
with various people. (Most recently, it came up when I met with students and post-docs at CSML,
where I had a very pleasant visit.)
In many fields it’s easy to come up with candidates. Some examples are:
Computer Science: Prove that P is not equal to NP.
Pure Math: Prove the Riemann hypothesis.
Applied Math: Solve the Navier-Stokes existence and smoothness problem.
You can probably think of other choices, but my point is that there ARE some obvious choices.
There are plenty of unsolved problems in statistics and machine
learning. Indeed, Wikipedia has a page called Unsolved problems in Statistics.
I admire the author of this Wikipedia page for even attempting this. It is not his or her fault that the list is pathetic. Some examples
on the list include: the admissibility of the Graybill-Deal estimator (boring); how to detect and correct for systematic errors (important
but vague); and the sunrise problem: What is the probability that the sun will rise tomorrow? (you’ve got to be kidding). I’m sure we could
come up with more interesting problems of a more technical nature. But I can’t think of one problem that satisfies the following three criteria:
1. It can be stated succinctly.
2. You can quickly explain it to a non-specialist and they will get the gist
of the problem.
3. It is sexy. In other words, when you explain it to someone they
think it is cool, even if they don’t know what you are talking about.
Having sexy open problems is more than just an amusement; it helps raise
the profile of the field.
Can anyone suggest a good open problem that satisfies the three criteria?