Bayesian Analysis

A Generalised Semiparametric Bayesian Fay–Herriot Model for Small Area Estimation Shrinking Both Means and Variances

Silvia Polettini

Full-text: Open access

Abstract

In survey sampling, interest often lies in unplanned domains (or small areas), whose sample sizes may be too small to allow for accurate design-based inference. To improve the direct estimates by borrowing strength from similar domains, most small area methods rely on mixed effects regression models.
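
As a rough illustration of what "borrowing strength" means in an area-level mixed model, the sketch below computes the standard shrinkage predictor for a model of the form y_i = x_i'beta + v_i + e_i, with v_i ~ N(0, A) and e_i ~ N(0, D_i). It is not the method proposed in the paper: the function name `fay_herriot_blup`, the toy numbers, and the assumption that both the model variance A and the sampling variances D_i are known are introduced here purely for illustration.

```python
import numpy as np

def fay_herriot_blup(y, X, D, A):
    """Shrinkage predictor for an area-level model (illustrative sketch).

    y : direct estimates, shape (m,)
    X : area-level covariates, shape (m, p)
    D : sampling variances, shape (m,), ASSUMED known here
    A : random-effect variance, ASSUMED known here
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    D = np.asarray(D, dtype=float)

    w = 1.0 / (A + D)                  # GLS weights, one per area
    XtW = X.T * w                      # X' W with W = diag(w)
    beta = np.linalg.solve(XtW @ X, XtW @ y)

    synthetic = X @ beta               # regression ("synthetic") estimate
    gamma = A / (A + D)                # shrinkage factor in (0, 1)
    theta = gamma * y + (1.0 - gamma) * synthetic
    return theta, gamma, beta

# Toy example: two areas, intercept-only regression.
theta, gamma, beta = fay_herriot_blup(
    y=[10.0, 20.0], X=[[1.0], [1.0]], D=[1.0, 9.0], A=1.0
)
print(theta, gamma, beta)
```

Areas with a large sampling variance D_i get a small weight gamma_i on their own direct estimate and are pulled more strongly towards the regression estimate, which is the sense in which small domains borrow strength from similar ones.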

This contribution extends the well-known Fay–Herriot model (Fay and Herriot, 1979) within a Bayesian approach in two directions. First, the default normality assumption for the random effects is replaced by a nonparametric specification using a Dirichlet process. Second, uncertainty about the variances is explicitly introduced, recognizing that they are themselves estimated from survey data. The proposed approach shrinks variances as well as means, and accounts for all sources of uncertainty. Adopting a flexible model for the random effects makes it possible to accommodate outliers and to vary the borrowing of strength by identifying local neighbourhoods where the exchangeability assumption holds. Through application to real and simulated data, we investigate the performance of the proposed model in predicting the domain means under different distributional assumptions. We also focus on the construction of credible intervals for the area means, a topic that has received less attention in the literature. Frequentist properties such as mean squared prediction error (MSPE), coverage and interval length are investigated. The experiments suggest that inferences under the proposed model have smaller mean squared error than competing approaches, and that the frequentist coverage of the credible intervals is close to nominal.
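
To give some intuition for how a Dirichlet process specification can produce groups of areas that share a random effect, the following sketch draws area effects from the Pólya urn representation of a Dirichlet process (Blackwell and MacQueen, 1973). It is only a prior simulation under assumptions made here, not the paper's model or sampler: the function name `polya_urn_effects`, the normal base distribution N(0, tau^2), and the values of the precision parameter `alpha` are all hypothetical choices.

```python
import numpy as np

def polya_urn_effects(m, alpha, tau, rng):
    """Draw m random effects from a Dirichlet process prior via the Polya urn.

    alpha : precision parameter (> 0); larger values give more distinct values
    tau   : standard deviation of the ASSUMED normal base distribution
    rng   : a numpy.random.Generator
    """
    effects = np.empty(m)
    for i in range(m):
        # With probability alpha / (alpha + i) draw a fresh value from the
        # base distribution; otherwise copy one of the i earlier effects,
        # chosen uniformly. For i = 0 a fresh value is always drawn.
        if rng.random() < alpha / (alpha + i):
            effects[i] = rng.normal(0.0, tau)
        else:
            effects[i] = effects[rng.integers(i)]
    return effects

rng = np.random.default_rng(2017)
v = polya_urn_effects(m=50, alpha=1.0, tau=1.0, rng=rng)
n_clusters = len(np.unique(v))
print(f"{n_clusters} distinct effect values among {len(v)} areas")
```

Areas that end up with the same value form a cluster, so a small `alpha` yields a few large groups of areas while a large `alpha` approaches independent draws from the base distribution.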

Article information

Source
Bayesian Anal., Volume 12, Number 3 (2017), 729-752.

Dates
First available in Project Euclid: 7 September 2016

Permanent link to this document
https://projecteuclid.org/euclid.ba/1473276257

Digital Object Identifier
doi:10.1214/16-BA1019

Mathematical Reviews number (MathSciNet)
MR3655874

Zentralblatt MATH identifier
1384.62271

Keywords
Dirichlet process prior; Fay–Herriot; hierarchical models; mixed effects regression models; small area; smoothing of sampling variances

Rights
Creative Commons Attribution 4.0 International License.

Citation

Polettini, Silvia. A Generalised Semiparametric Bayesian Fay–Herriot Model for Small Area Estimation Shrinking Both Means and Variances. Bayesian Anal. 12 (2017), no. 3, 729–752. doi:10.1214/16-BA1019. https://projecteuclid.org/euclid.ba/1473276257



References

  • Antoniak, C. E. (1974). “Mixtures of Dirichlet processes with applications to Bayesian nonparametric problems.” Annals of Statistics, 2: 1152–1174.
  • Arora, V. and Lahiri, P. (1997). “On the superiority of the Bayesian method over the BLUP in small area estimation problems.” Statistica Sinica, 7: 1053–1064.
  • Articus, C. and Burgard, J. P. (2014). “A Finite Mixture Fay Herriot-type model for estimating regional rental prices in Germany.” Technical report, University of Trier, Department of Economics.
  • Azzalini, A. and Capitanio, A. (2003). “Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t-distribution.” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 65(2): 367–389.
  • Battese, G. E., Harter, R. M., and Fuller, W. A. (1988). “An error-components model for prediction of county crop areas using survey and satellite data.” Journal of the American Statistical Association, 83: 28–36.
  • Bell, W. R. (2008). “Examining Sensitivity of Small Area Inferences to Uncertainty About Sampling Error Variances.” In ASA Proceedings of the Section on Survey Research, 871–876. John Wiley.
  • Blackwell, D. and MacQueen, J. B. (1973). “Ferguson distributions via Pólya urn schemes.” Annals of Statistics, 1: 353–355.
  • Celeux, G., Forbes, F., Robert, C. P., and Titterington, D. M. (2006). “Deviance information criteria for missing data models.” Bayesian Analysis, 1(4): 651–673.
  • Chakraborty, A., Datta, G. S., and Mandal, A. (2016). “A two-component normal mixture alternative to the Fay–Herriot model.” Statistics in Transition new series, 17(1): 67–90.
  • Dass, S. C., Maiti, T., Ren, H., and Sinha, S. (2012). “Confidence interval estimation of small area parameters shrinking both means and variances.” Survey Methodology, 38: 173–187.
  • Datta, G. and Ghosh, M. (2012). “Small area shrinkage estimation.” Statistical Science, 27(1): 95–114.
  • Datta, G. S. and Lahiri, P. (1995). “Robust hierarchical Bayes estimation of small area characteristics in the presence of covariates and outliers.” Journal of Multivariate Analysis, 54(2): 310–328.
  • Datta, G. S. and Lahiri, P. (2000). “A unified measure of uncertainty of estimated best linear unbiased predictors in small area estimation problems.” Statistica Sinica, 10(2): 613–627.
  • Diao, L., Smith, D. D., Datta, G. S., Maiti, T., and Opsomer, J. D. (2014). “Accurate confidence interval estimation of small area parameters under the Fay–Herriot model.” Scandinavian Journal of Statistics, 41(2): 497–515.
  • Dick, P. (1995). “Modelling net undercoverage in the 1991 Canadian Census.” Survey Methodology, 21: 45–54.
  • Dorazio, R. M. (2009). “On selecting a prior for the precision parameter of Dirichlet process mixture models.” Journal of Statistical Planning and Inference, 139(9): 3384–3390.
  • Escobar, M. D. and West, M. (1995). “Bayesian density estimation and inference using mixtures.” Journal of the American Statistical Association, 90: 577–588.
  • Fabrizi, E. and Trivisano, C. (2010). “Robust linear mixed models for Small Area Estimation.” Journal of Statistical Planning and Inference, 140: 433–443.
  • Fay, R. and Herriot, R. (1979). “Estimates of income for small places: an application of James–Stein procedures to Census Data.” Journal of the American Statistical Association, 74: 269–277.
  • Ferguson, T. S. (1973). “A Bayesian analysis of some nonparametric problems.” Annals of Statistics, 1(2): 209–230.
  • Geisser, S. and Eddy, W. F. (1979). “A predictive approach to model selection.” Journal of the American Statistical Association, 74(365): 153–160.
  • Hawala, S. and Lahiri, P. (2010). “Variance modeling in the U.S. small area income and poverty estimates program for the American community survey.” In Proceedings of the American Statistical Association, Survey Methods Section, Denver, Colorado. Alexandria, VA: American Statistical Association.
  • Liu, J. S. (1996). “Nonparametric hierarchical Bayes via sequential imputations.” Annals of Statistics, 24(3): 911–930.
  • Lo, A. Y. (1984). “On a class of Bayesian nonparametric estimates. I. Density estimates.” Annals of Statistics, 12(1): 351–357.
  • Maiti, T. (2003). “Modelling small area effects using mixture of Gaussians.” Sankhyā: The Indian Journal of Statistics, 65(3): 612–625.
  • Maiti, T., Ren, H., and Sinha, S. (2014). “Prediction error of small area predictors shrinking both means and variances.” Scandinavian Journal of Statistics, 41(3): 775–790.
  • Malec, D. and Müller, P. (2008). A Bayesian semi-parametric model for small area estimation, volume 3 of Collections, 223–236. Beachwood, Ohio, USA: Institute of Mathematical Statistics.
  • Maples, J., Bell, W., and Huang, E. (2009). “Small area variance modeling with application to county poverty estimates from the American community survey.” In Proceedings of the American Statistical Association, Section on Survey Research Methods, Alexandria, VA: American Statistical Association, 5056–5067.
  • McCulloch, C. E. and Neuhaus, J. M. (2011). “Misspecifying the shape of a random effects distribution: why getting it wrong may not matter.” Statistical Science, 26(3): 388–402.
  • Murugiah, S. and Sweeting, T. (2012). “Selecting the precision parameter prior in Dirichlet process mixture models.” Journal of Statistical Planning and Inference, 142(7): 1947–1959.
  • Ohlssen, D. I., Sharples, L. D., and Spiegelhalter, D. J. (2007). “Flexible random-effects models using Bayesian semi-parametric models: Applications to institutional comparisons.” Statistics in Medicine, 26: 2088–2112.
  • Prasad, N. G. N. and Rao, J. N. K. (1990). “The estimation of the mean squared error of small-area estimators.” Journal of the American Statistical Association, 85(409): 163–171.
  • Rao, J. N. K. and Molina, I. (2015). Small Area Estimation. Wiley Series in Survey Methodology. John Wiley & Sons, Inc., Hoboken, NJ, second edition. With a foreword by Graham Kalton.
  • Rivest, L.-P. and Vandal, N. (2003). “Mean squared error estimation for small areas when the small area variances are estimated.” In Roberts, G. and Bellhouse, D. (eds.), Proceedings of the International Conference on Recent Advances in Survey Sampling, ICRASS 2002, Ottawa, Canada. Statistics Canada.
  • Rossi, P. (2014). Bayesian Non- and Semi-parametric Methods and Applications. Princeton University Press.
  • Sinha, S. K. and Rao, J. N. K. (2009). “Robust small area estimation.” Canadian Journal of Statistics, 37(3): 381–399.
  • Spiegelhalter, D. J., Best, N. G., Carlin, B. P., and Van Der Linde, A. (2002). “Bayesian measures of model complexity and fit.” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64(4): 583–639.
  • Wang, J. and Fuller, W. A. (2003). “The mean squared error of small area predictors constructed with estimated area variances.” Journal of the American Statistical Association, 98(463): 716–723.
  • Watanabe, S. (2009). Algebraic geometry and statistical learning theory, volume 25 of Cambridge Monographs on Applied and Computational Mathematics. Cambridge University Press, Cambridge.
  • Watanabe, S. (2010). “Asymptotic equivalence of Bayes cross validation and widely applicable information criterion in singular learning theory.” Journal of Machine Learning Research, 11: 3571–3594.
  • West, M., Müller, P., and Escobar, M. D. (1994). “Hierarchical priors and mixture models, with application in regression and density estimation.” In Freeman, P. R. and Smith, A. F. M. (eds.), Aspects of Uncertainty. A Tribute to D. V. Lindley, 363–386. John Wiley & Sons.
  • You, Y. and Chapman, B. (2006). “Small area estimation using area level models and estimated sampling variances.” Survey Methodology, 32: 97–103.