When do nonparametric Bayesian procedures “overfit”? To shed light on this question, we consider a binary regression problem in detail and establish frequentist consistency for a certain class of Bayes procedures based on hierarchical priors, called uniform mixture priors. These are defined as follows: let ν be any probability distribution on the nonnegative integers. To sample a function f from the prior π_ν, first sample m from ν and then sample f uniformly from the set of step functions from [0,1] into [0,1] that have exactly m jumps (i.e., sample all m jump locations and m+1 function values independently and uniformly). The main result states that if a data stream is generated according to any fixed, measurable binary-regression function f_0 ≢ 1/2, then frequentist consistency obtains: that is, for any ν with infinite support, the posterior of π_ν concentrates on any L^1 neighborhood of f_0. The solution of an associated large-deviations problem is central to the consistency proof.
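The two-stage sampling recipe for the uniform mixture prior can be illustrated with a minimal Python sketch. Several choices here are assumptions made only for illustration and do not come from the paper: ν is taken to be a geometric distribution (one convenient distribution with infinite support), the covariates are drawn uniformly on [0,1], and all function names (`sample_geometric`, `sample_step_function`, `simulate_data`) are hypothetical.

```python
import bisect
import random


def sample_geometric(rng, p=0.5):
    """Assumed choice of nu: geometric on {0, 1, 2, ...} (infinite support)."""
    m = 0
    while rng.random() >= p:
        m += 1
    return m


def sample_step_function(rng, sample_m=sample_geometric):
    """Draw one step function f: [0,1] -> [0,1] from the uniform mixture prior.

    Stage 1: draw the number of jumps m from nu.
    Stage 2: draw m jump locations and m + 1 function values,
             all independently and uniformly on [0,1].
    """
    m = sample_m(rng)
    jumps = sorted(rng.random() for _ in range(m))
    values = [rng.random() for _ in range(m + 1)]

    def f(x):
        # The number of jumps at or below x selects the active step value.
        return values[bisect.bisect_right(jumps, x)]

    return m, jumps, values, f


def simulate_data(rng, f0, n):
    """Generate n binary observations with P(y = 1 | x) = f0(x).

    Assumption: covariates x are uniform on [0,1].
    """
    data = []
    for _ in range(n):
        x = rng.random()
        y = 1 if rng.random() < f0(x) else 0
        data.append((x, y))
    return data


if __name__ == "__main__":
    rng = random.Random(0)
    m, jumps, values, f = sample_step_function(rng)
    print("number of jumps:", m)
    print("f(0.25) =", f(0.25))
    print("first observations:", simulate_data(rng, f, 5))
```

The sketch covers only drawing from the prior and generating data; computing the posterior, which is the object of the consistency result, is not attempted here.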
"Consistency of Bayes estimators of a binary regression function." Ann. Statist. 34 (3) 1233-1269, June 2006. https://doi.org/10.1214/009053606000000236