The Annals of Statistics

Consistency of Bayes estimators of a binary regression function

Marc Coram and Steven P. Lalley


Abstract

When do nonparametric Bayesian procedures “overfit”? To shed light on this question, we consider a binary regression problem in detail and establish frequentist consistency for a certain class of Bayes procedures based on hierarchical priors, called uniform mixture priors. These are defined as follows: let ν be any probability distribution on the nonnegative integers. To sample a function f from the prior π_ν, first sample m from ν and then sample f uniformly from the set of step functions from [0,1] into [0,1] that have exactly m jumps (i.e., sample all m jump locations and m+1 function values independently and uniformly). The main result states that if a data stream is generated according to any fixed, measurable binary-regression function f_0 ≢ 1/2, then frequentist consistency obtains: that is, for any ν with infinite support, the posterior of π_ν concentrates on any L^1 neighborhood of f_0. Solution of an associated large-deviations problem is central to the consistency proof.
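The two-stage sampling scheme described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper. The function names `sample_step_function` and `geometric_nu` are hypothetical, and the geometric distribution is only one possible choice of ν with infinite support. Right-continuity at the jump points is also an assumption, since the abstract does not fix a convention and the choice does not affect L^1 distances.

```python
import bisect
import random


def geometric_nu(rng, p=0.5):
    """Hypothetical choice of nu: geometric on {0, 1, 2, ...} (infinite support)."""
    m = 0
    while rng.random() >= p:
        m += 1
    return m


def sample_step_function(sample_m, rng=random):
    """Draw one step function f: [0,1] -> [0,1] by the two-stage scheme.

    sample_m: callable taking the random generator and returning m ~ nu.
    Returns the sorted jump locations, the m+1 levels, and f itself.
    """
    m = sample_m(rng)                                # stage 1: number of jumps
    jumps = sorted(rng.random() for _ in range(m))   # m locations, i.i.d. uniform
    values = [rng.random() for _ in range(m + 1)]    # m+1 levels, i.i.d. uniform

    def f(x):
        # Index of the interval containing x (right-continuous by assumption).
        return values[bisect.bisect_right(jumps, x)]

    return jumps, values, f


rng = random.Random(0)  # fixed seed so the draw is reproducible
jumps, values, f = sample_step_function(geometric_nu, rng)
```

A draw with m = 0 is a constant function. Larger values of m give finer step functions, which is why ν needs infinite support for the prior to approximate an arbitrary measurable f_0.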

Article information

Source
Ann. Statist., Volume 34, Number 3 (2006), 1233-1269.

Dates
First available in Project Euclid: 10 July 2006

Permanent link to this document
https://projecteuclid.org/euclid.aos/1152540748

Digital Object Identifier
doi:10.1214/009053606000000236

Mathematical Reviews number (MathSciNet)
MR2278357

Zentralblatt MATH identifier
1113.62006

Subjects
Primary: 62A15 62E20: Asymptotic distribution theory

Keywords
Consistency; Bayes procedure; binary regression; large deviations; subadditivity

Citation

Coram, Marc; Lalley, Steven P. Consistency of Bayes estimators of a binary regression function. Ann. Statist. 34 (2006), no. 3, 1233--1269. doi:10.1214/009053606000000236. https://projecteuclid.org/euclid.aos/1152540748



References

  • Barron, A., Schervish, M. J. and Wasserman, L. (1999). The consistency of posterior distributions in nonparametric problems. Ann. Statist. 27 536--561.
  • Chi, Z. (2001). Stochastic sub-additivity approach to the conditional large deviation principle. Ann. Probab. 29 1303--1328.
  • Coram, M. (2002). Nonparametric Bayesian classification. Ph.D. dissertation, Stanford Univ.
  • Diaconis, P. and Freedman, D. (1986). On inconsistent Bayes estimates of location. Ann. Statist. 14 68--87.
  • Diaconis, P. and Freedman, D. A. (1993). Nonparametric binary regression: A Bayesian approach. Ann. Statist. 21 2108--2137.
  • Diaconis, P. and Freedman, D. A. (1995). Nonparametric binary regression with random covariates. Probab. Math. Statist. 15 243--273.
  • Freedman, D. A. (1963). On the asymptotic behavior of Bayes' estimates in the discrete case. Ann. Math. Statist. 34 1386--1403.
  • Freedman, D. and Diaconis, P. (1983). On inconsistent Bayes estimates in the discrete case. Ann. Statist. 11 1109--1118.
  • Ghosal, S., Ghosh, J. K. and van der Vaart, A. W. (2000). Convergence rates of posterior distributions. Ann. Statist. 28 500--531.
  • Ghosh, J. K. and Ramamoorthi, R. V. (2003). Bayesian Nonparametrics. Springer, New York.
  • Hammersley, J. M. (1962). Generalization of the fundamental theorem on sub-additive functions. Proc. Cambridge Philos. Soc. 58 235--238.
  • Kaijser, T. (1975). A limit theorem for partially observed Markov chains. Ann. Probab. 3 677--696.
  • Karlsson, A. and Margulis, G. A. (1999). A multiplicative ergodic theorem and nonpositively curved spaces. Comm. Math. Phys. 208 107--123.
  • Kieffer, J. C. (1973). A counterexample to Perez's generalization of the Shannon--McMillan theorem. Ann. Probab. 1 362--364.
  • Kieffer, J. C. (1974). A simple proof of the Moy--Perez generalization of the Shannon--McMillan theorem. Pacific J. Math. 51 203--206.
  • Kingman, J. F. C. (1973). Subadditive ergodic theory (with discussion). Ann. Probab. 1 883--909.
  • Liggett, T. M. (1985). An improved subadditive ergodic theorem. Ann. Probab. 13 1279--1285.
  • Perez, A. (1964). Extensions of Shannon--McMillan's limit theorem to more general stochastic processes. In Trans. Third Prague Conference on Information Theory, Statistical Decision Functions, Random Processes (Liblice, 1962) 545--574. Publ. House Czech. Acad. Sci., Prague.
  • Perez, A. (1980). On Shannon--McMillan's limit theorem for pairs of stationary random processes. Kybernetika (Prague) 16 301--314.
  • Pollard, D. (1984). Convergence of Stochastic Processes. Springer, New York.
  • Ruelle, D. (1999). Statistical Mechanics. Rigorous Results. World Scientific, River Edge, NJ.
  • Schwartz, L. (1965). On Bayes procedures. Z. Wahrsch. Verw. Gebiete 4 10--26.
  • Shen, X. and Wasserman, L. (2001). Rates of convergence of posterior distributions. Ann. Statist. 29 687--714.
  • Stein, E. M. (1970). Singular Integrals and Differentiability Properties of Functions. Princeton Univ. Press.
  • Walker, S. G. (2004). New approaches to Bayesian consistency. Ann. Statist. 32 2028--2043.
  • Walker, S. G. (2004). Modern Bayesian asymptotics. Statist. Sci. 19 111--117.
  • Walters, P. (1982). An Introduction to Ergodic Theory. Springer, New York.