The Annals of Statistics

Nonparametric hierarchical Bayes via sequential imputations

Jun S. Liu



We consider the empirical Bayes estimation of a distribution using binary data via the Dirichlet process. Let $\mathscr{D}(\alpha)$ denote a Dirichlet process with $\alpha$ being a finite measure on $[0, 1]$. Instead of having direct samples from an unknown random distribution $F$ drawn from $\mathscr{D}(\alpha)$, we assume that only indirect binomial data are observable. This paper presents a new interpretation of Lo's formula, and thereby relates the predictive density of the observations under a Dirichlet process model to the likelihoods of much simpler models. As a consequence, the log-likelihood surface and the maximum likelihood estimate of $c = \alpha([0, 1])$ are found when the shape of $\alpha$ is assumed known, together with a formula for the Fisher information evaluated at the estimate. The sequential imputation method of Kong, Liu and Wong is recommended for overcoming the computational difficulties commonly encountered in this area, and the related approximation formulas are provided. An analysis of the tack data of Beckett and Diaconis, which motivated this study, is included to illustrate our methods.
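To make the sequential imputation idea concrete, the following is a minimal sketch and not the paper's exact algorithm. It assumes that each observation is $y_i \sim \mathrm{Binomial}(l, \zeta_i)$ with $\zeta_i \sim F$ and $F \sim \mathscr{D}(\alpha)$, and that $\alpha$ is $c$ times the uniform distribution on $[0, 1]$ (the uniform shape is an assumption made here for illustration). The function names `binom_pmf`, `sequential_imputation` and `estimate_likelihood` are hypothetical. Each $\zeta_i$ is imputed in turn from its conditional distribution given $y_i$ and the earlier imputations, using the Pólya urn predictive rule, and the importance weight accumulates the predictive probabilities of the $y_i$.

```python
import math
import random


def binom_pmf(y, l, z):
    """Binomial(l, z) probability of observing y successes."""
    return math.comb(l, y) * z**y * (1.0 - z) ** (l - y)


def sequential_imputation(ys, l, c, rng):
    """One sequential-imputation pass (illustrative sketch).

    Assumes alpha = c * Uniform(0, 1). Returns the imputed success
    probabilities and the importance weight, which is the product of
    the predictive probabilities p(y_i | zeta_1, ..., zeta_{i-1}).
    """
    zetas = []
    weight = 1.0
    for i, y in enumerate(ys):
        # Mass for a fresh draw from the base measure:
        # c * integral of Binomial(y; l, z) dz over [0, 1] = c / (l + 1).
        new_mass = c / (l + 1.0)
        # Mass for reusing each previously imputed value (Polya urn).
        old_masses = [binom_pmf(y, l, z) for z in zetas]
        total = new_mass + sum(old_masses)
        # Predictive probability of y_i given the earlier imputations.
        weight *= total / (c + i)
        # Impute zeta_i from its conditional distribution.
        u = rng.random() * total
        if u < new_mass:
            # Posterior under the uniform base measure is Beta(y+1, l-y+1).
            zetas.append(rng.betavariate(y + 1, l - y + 1))
        else:
            u -= new_mass
            chosen = zetas[-1]  # fallback guards against rounding error
            for z, mass in zip(zetas, old_masses):
                if u < mass:
                    chosen = z
                    break
                u -= mass
            zetas.append(chosen)
    return zetas, weight


def estimate_likelihood(ys, l, c, m=200, seed=0):
    """Average the weights of m independent passes to estimate p(ys | c)."""
    rng = random.Random(seed)
    return sum(sequential_imputation(ys, l, c, rng)[1] for _ in range(m)) / m
```

Under these assumptions, evaluating `estimate_likelihood` on a grid of `c` values gives a Monte Carlo approximation to the likelihood surface of $c$. The data passed in and the grid of `c` values are up to the user and are not taken from the paper.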

Article information

Ann. Statist. Volume 24, Number 3 (1996), 911-930.

First available in Project Euclid: 20 September 2002

Primary: 62G05: Estimation
Secondary: 62E25 65U05

Dirichlet process; empirical Bayes; Gibbs sampler; importance sampling; Pólya urn; sensitivity analysis


Liu, Jun S. Nonparametric hierarchical Bayes via sequential imputations. Ann. Statist. 24 (1996), no. 3, 911-930. doi:10.1214/aos/1032526949.



  • Antoniak, C. E. (1974). Mixtures of Dirichlet processes with applications to Bayesian nonparametric problems. Ann. Statist. 2 1152-1174.
  • Beckett, L. and Diaconis, P. (1994). Spectral analysis for discrete longitudinal data. Adv. Math. 103 107-128.
  • Berry, D. and Christensen, R. (1979). Empirical Bayes estimation of a binomial parameter via mixture of Dirichlet processes. Ann. Statist. 7 558-568.
  • Blackwell, D. and MacQueen, J. B. (1973). Ferguson distributions via Pólya urn schemes. Ann. Statist. 1 353-355.
  • Box, G. E. P. (1980). Sampling and Bayes' inference in scientific modeling and robustness (with discussion). J. Roy. Statist. Soc. Ser. A 143 383-430.
  • Chernoff, H. (1994). Personal communication.
  • Doksum, K. A. (1972). Decision theory for some nonparametric models. Proc. Sixth Berkeley Symp. Math. Statist. Probab. 1 331-343. Univ. California Press, Berkeley.
  • Doss, H. (1994). Bayesian nonparametric estimation for incomplete data via successive substitution sampling. Ann. Statist. 22 1763-1786.
  • Efron, B. and Morris, C. (1975). Data analysis using Stein's estimator and its generalizations. J. Amer. Statist. Assoc. 70 311-319.
  • Escobar, M. D. (1994). Estimating normal means with a Dirichlet process prior. J. Amer. Statist. Assoc. 89 268-277.
  • Escobar, M. D. and West, M. (1995). Bayesian density estimation and inference using mixtures. J. Amer. Statist. Assoc. 90 577-588.
  • Ferguson, T. S. (1973). A Bayesian analysis of some nonparametric problems. Ann. Statist. 1 209-230.
  • Ferguson, T. S. (1974). Prior distributions on spaces of probability measures. Ann. Statist. 2 615-629.
  • Gelfand, A. E. and Smith, A. F. M. (1990). Sampling-based approaches to calculating marginal densities. J. Amer. Statist. Assoc. 85 398-409.
  • Geyer, C. J. and Thompson, E. A. (1992). Constrained Monte Carlo maximum likelihood for dependent data (with discussion). J. Roy. Statist. Soc. Ser. B 54 657-699.
  • Kong, A., Liu, J. S. and Wong, W. H. (1994). Sequential imputations and Bayesian missing data problems. J. Amer. Statist. Assoc. 89 278-288.
  • Korwar, R. M. and Hollander, M. (1973). Contributions to the theory of Dirichlet processes. Ann. Statist. 1 705-711.
  • Kuo, L. (1986). Computations of mixtures of Dirichlet processes. SIAM J. Sci. Statist. Comput. 7 60-71.
  • Laird, N. (1978). Nonparametric maximum likelihood estimation of a mixing distribution. J. Amer. Statist. Assoc. 73 805-811.
  • Liu, J. S. (1994). The collapsed Gibbs sampler in Bayesian computations with application to a gene regulation problem. J. Amer. Statist. Assoc. 89 958-966.
  • Liu, J. S. (1996). Metropolized independent sampling scheme with comparisons to rejection sampling and importance sampling. Statist. Comput. 6. To appear.
  • Liu, J. S., Wong, W. H. and Kong, A. (1994). Covariance structure of the Gibbs sampler with applications to the comparisons of estimators and augmentation schemes. Biometrika 81 27-40.
  • Lo, A. Y. (1984). On a class of Bayesian nonparametric estimates. I. Density estimates. Ann. Statist. 12 351-357.
  • MacEachern, S. M. (1994). Estimating normal means with a conjugate style Dirichlet process prior. Comm. Statist. Simulation Comput. 23 727-741.
  • Smith, A. F. M. and Roberts, G. O. (1993). Bayesian computation via the Gibbs sampler and related Markov chain Monte Carlo methods. J. Roy. Statist. Soc. Ser. B 55 3-23.
  • Tanner, M. A. and Wong, W. H. (1987). The calculation of posterior distributions by data augmentation (with discussion). J. Amer. Statist. Assoc. 82 528-550.
  • West, M. (1992). Hyperparameter estimation in Dirichlet process mixture models. ISDS Discussion Paper 92-A03, Duke Univ.