Theorem 3.11 in his thesis gives simultaneous confidence bands (based on the ball/ellipsoid in the RKHS). I will check out the paper, thanks for the reference.

Thanks for the pointer

I’ll check it out

Thanks! Any hints or thoughts on the source or the magnitude of the bias?

On the complication: I think the procedure was partly motivated by the preference to work on collapsed/aggregate data (much, much faster), namely frequencies by bins. Would you say those are not sufficient statistics for the density, or at least that the bootstrap procedure wrongly introduces within-bin correlation? (Though if all that matters is the frequency in the bin, how bad is that?)
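One way to make the "frequencies by bins" question concrete: if the statistic depends on the data only through the bin counts, then resampling the n individual observations with replacement and re-binning gives counts distributed as Multinomial(n, observed counts / n), so the ordinary bootstrap can be run on the collapsed data directly. A minimal sketch, assuming NumPy; the function name `bootstrap_bin_counts` and its parameters are made up for illustration and are not taken from any code discussed in this thread.

```python
import numpy as np

def bootstrap_bin_counts(counts, n_boot=1000, seed=0):
    """Draw bootstrap replicates of a vector of bin counts.

    Equivalent in distribution to resampling the underlying observations
    with replacement and re-binning them, for statistics that depend on
    the data only through the bin counts.
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts)
    n = int(counts.sum())
    # Each row is one bootstrap replicate of the full count vector.
    return rng.multinomial(n, counts / n, size=n_boot)

# Usage: the replicate counts always sum to the original sample size.
draws = bootstrap_bin_counts([5, 10, 85], n_boot=200, seed=1)
```

Any bin-based statistic can then be recomputed on each row of `draws`. This only speaks to the resampling step; it says nothing about the bias from the choice of bins or of the fitted form.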

That seems unnecessary and kind of complicated.

And it would add additional bias.

The regular bootstrap as I described should work fine.

But if one is interested in a property of the density (e.g. excess mass at/around a prespecified point), is it wrong to calculate frequencies by bins, fit some flexible parametric form, and bootstrap by resampling the errors, where the error is the discrepancy between the frequency in the bin and the predicted density? This seems to be what Chetty et al. describe on page 23 (p. 25 of the PDF) of an influential paper [1], with their posted code [2] used in other papers since.

Does this “macro level”, aggregate (block?) bootstrap do something different from resampling the empirical CDF? In any case, is the error in the parametric fit of the density the only noise we care about in the problem if we think that individual values (not frequencies) themselves have noise in them?
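To pin down what I mean by the aggregate procedure, here is a rough sketch of a residual bootstrap on bin counts. It is my own simplification, assuming NumPy, and is not the posted code [2]: the names `excess_mass` and `residual_bootstrap`, the polynomial degree, and the choice to fit only on bins outside the window are all assumptions made for illustration.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def excess_mass(counts, centers, window, degree=3):
    """Sum of (count - polynomial fit) over bins inside `window`.

    The polynomial is fitted only to bins outside the window.
    """
    inside = (centers >= window[0]) & (centers <= window[1])
    coefs = P.polyfit(centers[~inside], counts[~inside], degree)
    fitted = P.polyval(centers, coefs)
    return (counts[inside] - fitted[inside]).sum(), fitted, inside

def residual_bootstrap(counts, centers, window, degree=3, n_boot=500, seed=0):
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=float)
    centers = np.asarray(centers, dtype=float)
    estimate, fitted, inside = excess_mass(counts, centers, window, degree)
    resid = (counts - fitted)[~inside]           # errors from the fitted bins
    base = np.where(inside, counts, fitted)      # keep observed mass in window
    draws = np.empty(n_boot)
    for b in range(n_boot):
        # New pseudo-counts: smooth base plus errors resampled with replacement.
        fake = base + rng.choice(resid, size=counts.size, replace=True)
        draws[b] = excess_mass(fake, centers, window, degree)[0]
    return estimate, draws.std(ddof=1), draws
```

Note that the only randomness here comes from the bin-level fitting errors, which are treated as exchangeable across bins; sampling variation in the individual observations enters only through those errors.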

I also forked this comment into a question on Cross Validated [3]. Also note the related question [4] about whether the estimand is regular enough to bootstrap or subsample in the first place, and with what rate of convergence (motivated by the Normal Deviate posts from January).

[1]: http://obs.rc.fas.harvard.edu/chetty/denmark_adjcost_nber.pdf

[2]: http://obs.rc.fas.harvard.edu/chetty/bunch_count.zip

[3]: http://stats.stackexchange.com/q/69307/6534

[4]: http://stats.stackexchange.com/questions/67613/is-excess-mass-estimation-smooth-enough-to-bootstrap-at-what-rate-might-a-bunch

Yes, it is easy to get an exact confidence interval for the median.

But this is unrelated to function estimation.
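For reference, the median interval mentioned above can be built from order statistics: for a continuous distribution, the number of observations below the median is Binomial(n, 1/2), so the interval from the k-th smallest to the k-th largest observation has coverage 1 - 2 P(B <= k - 1). A minimal sketch using only the Python standard library; the function name `median_ci` is made up for illustration.

```python
import math

def median_ci(sample, alpha=0.05):
    """Distribution-free confidence interval for the median.

    Returns (lower, upper, coverage), where coverage is the exact
    coverage for continuous data and is at least 1 - alpha.
    """
    x = sorted(sample)
    n = len(x)

    def binom_cdf(k):
        # P(B <= k) for B ~ Binomial(n, 1/2)
        return sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n

    # Find the largest k with P(B <= k - 1) <= alpha / 2.
    k = 0
    while binom_cdf(k) <= alpha / 2:
        k += 1
    if k == 0:
        raise ValueError("sample too small for the requested level")
    lower = x[k - 1]   # k-th smallest observation
    upper = x[n - k]   # k-th largest observation
    return lower, upper, 1 - 2 * binom_cdf(k - 1)

# Usage: with n = 10 this picks the 2nd and 9th order statistics.
print(median_ci(range(1, 11)))
```

The coverage is conservative rather than exactly 1 - alpha because the binomial is discrete; for n = 10 and alpha = 0.05 it is about 0.9785.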

Exactly the same problem happens in nonparametric regression. It’s the same analysis.

You can’t get correct bands in a Sobolev space without undersmoothing.

Furthermore, you can’t even adapt to the size of the Sobolev ball using the data.

See:

Mark Low (1997), Annals of Statistics, p. 2547

From a quick look, it appears to me that your student constructed a confidence ball.

This does not give a band.