## Statistical Science

### New Important Developments in Small Area Estimation

Danny Pfeffermann

#### Abstract

The problem of small area estimation (SAE) is how to produce reliable estimates of characteristics of interest such as means, counts, quantiles, etc., for areas or domains for which only small samples or no samples are available, and how to assess their precision. The purpose of this paper is to review and discuss some of the new important developments in small area estimation methods. Rao [Small Area Estimation (2003)] wrote a very comprehensive book, which covers all the main developments in this topic until that time. A few review papers have been written since 2003, but they are limited in scope. Hence, the focus of this review is on new developments in the last 7–8 years, but to make the review more self-contained, I also briefly mention some of the older developments. The review covers both design-based and model-dependent methods, with the latter methods further classified into frequentist and Bayesian methods. The style of the paper is similar to the style of my previous review on SAE published in 2002, explaining the new problems investigated and describing the proposed solutions, but without dwelling on theoretical details, which can be found in the original articles. I hope that this paper will be useful both to researchers who would like to learn more about the research carried out in SAE and to practitioners who might be interested in the application of the new methods.

#### Article information

Source
Statist. Sci. Volume 28, Number 1 (2013), 40–68.

Dates
First available in Project Euclid: 29 January 2013

http://projecteuclid.org/euclid.ss/1359468408

Digital Object Identifier
doi:10.1214/12-STS395

Mathematical Reviews number (MathSciNet)
MR3075338

#### Citation

Pfeffermann, Danny. New Important Developments in Small Area Estimation. Statist. Sci. 28 (2013), no. 1, 40–68. doi:10.1214/12-STS395. http://projecteuclid.org/euclid.ss/1359468408.

#### References

• Battese, G. E., Harter, R. M. and Fuller, W. A. (1988). An error components model for prediction of county crop area using survey and satellite data. J. Amer. Statist. Assoc. 83 28–36.
• Bayarri, M. J. and Castellanos, M. E. (2007). Bayesian checking of the second levels of hierarchical models. Statist. Sci. 22 322–343.
• Bell, W. R. and Huang, E. T. (2006). Using the $t$-distribution to deal with outliers in small area estimation. In Proceedings of Statistics Canada Symposium on Methodological Issues in Measuring Population Health. Statistics Canada, Ottawa, Canada.
• Chambers, R., Chandra, H. and Tzavidis, N. (2011). On bias-robust mean squared error estimation for pseudo-linear small area estimators. Survey Methodology 37 153–170.
• Chambers, R. and Tzavidis, N. (2006). $M$-quantile models for small area estimation. Biometrika 93 255–268.
• Chandra, H. and Chambers, R. (2009). Multipurpose small area estimation. Journal of Official Statistics 25 379–395.
• Chatterjee, S., Lahiri, P. and Li, H. (2008). Parametric bootstrap approximation to the distribution of EBLUP and related prediction intervals in linear mixed models. Ann. Statist. 36 1221–1245.
• Chaudhuri, S. and Ghosh, M. (2011). Empirical likelihood for small area estimation. Biometrika 98 473–480.
• Chen, S. and Lahiri, P. (2002). On mean squared prediction error estimation in small area estimation problems. In Proceedings of the Survey Research Methods Section 473–477. American Statistical Association, Alexandria, VA.
• Das, K., Jiang, J. and Rao, J. N. K. (2004). Mean squared error of empirical predictor. Ann. Statist. 32 818–840.
• Datta, G. S. (2009). Model-based approach to small area estimation. In Sample Surveys: Inference and Analysis (D. Pfeffermann and C. R. Rao, eds.). Handbook of Statistics 29B 251–288. North-Holland, Amsterdam.
• Datta, G. S., Hall, P. and Mandal, A. (2011). Model selection by testing for the presence of small-area effects, and application to area-level data. J. Amer. Statist. Assoc. 106 362–374.
• Datta, G. S. and Lahiri, P. (2000). A unified measure of uncertainty of estimated best linear unbiased predictors in small area estimation problems. Statist. Sinica 10 613–627.
• Datta, G. S., Rao, J. N. K. and Smith, D. D. (2005). On measuring the variability of small area estimators under a basic area level model. Biometrika 92 183–196.
• Datta, G. S., Rao, J. N. K. and Torabi, M. (2010). Pseudo-empirical Bayes estimation of small area means under a nested error linear regression model with functional measurement errors. J. Statist. Plann. Inference 140 2952–2962.
• Datta, G. S., Ghosh, M., Steorts, R. and Maples, J. (2011). Bayesian benchmarking with applications to small area estimation. Test 20 574–588.
• Dey, D. K., Gelfand, A. E., Swartz, T. B. and Vlachos, A. K. (1998). A simulation-intensive approach for checking hierarchical models. Test 7 325–346.
• Estevao, V. M. and Särndal, C. E. (2004). Borrowing strength is not the best technique within a wide class of design-consistent domain estimators. Journal of Official Statistics 20 645–669.
• Estevao, V. M. and Särndal, C. E. (2006). Survey estimates by calibration on complex auxiliary information. International Statistical Review 74 127–147.
• Falorsi, P. D. and Righi, P. (2008). A balanced sampling approach for multi-way stratification designs for small area estimation. Survey Methodology 34 223–234.
• Fay, R. E. and Herriot, R. A. (1979). Estimates of income for small places: An application of James–Stein procedures to census data. J. Amer. Statist. Assoc. 74 269–277.
• Ganesh, N. and Lahiri, P. (2008). A new class of average moment matching priors. Biometrika 95 514–520.
• Ghosh, M., Maiti, T. and Roy, A. (2008). Influence functions and robust Bayes and empirical Bayes small area estimation. Biometrika 95 573–585.
• Ghosh, M. and Rao, J. N. K. (1994). Small area estimation: An appraisal (with discussion). Statist. Sci. 9 65–93.
• Ghosh, M., Sinha, K. and Kim, D. (2006). Empirical and hierarchical Bayesian estimation in finite population sampling under structural measurement error models. Scand. J. Statist. 33 591–608.
• Ghosh, M. and Sinha, K. (2007). Empirical Bayes estimation in finite population sampling under functional measurement error models. J. Statist. Plann. Inference 137 2759–2773.
• Ghosh, M., Natarajan, K., Stroud, T. W. F. and Carlin, B. P. (1998). Generalized linear models for small-area estimation. J. Amer. Statist. Assoc. 93 273–282.
• Gurka, M. J. (2006). Selecting the best linear mixed model under REML. Amer. Statist. 60 19–26.
• Hall, P. and Maiti, T. (2006). On parametric bootstrap methods for small area prediction. J. R. Stat. Soc. Ser. B Stat. Methodol. 68 221–238.
• Huang, E. T. and Bell, W. R. (2006). Using the $t$-distribution in small area estimation: An application to SAIPE state poverty models. In Proceedings of the Survey Research Methods Section 3142–3149. American Statistical Association, Alexandria, VA.
• Jiang, J., Lahiri, P. and Wan, S. M. (2002). A unified jackknife theory for empirical best prediction with $M$-estimation. Ann. Statist. 30 1782–1810.
• Jiang, J. and Lahiri, P. (2006a). Estimation of finite population domain means: A model-assisted empirical best prediction approach. J. Amer. Statist. Assoc. 101 301–311.
• Jiang, J. and Lahiri, P. (2006b). Mixed model prediction and small area estimation. Test 15 1–96.
• Jiang, J., Nguyen, T. and Rao, J. S. (2010). Fence method for non-parametric small area estimation. Survey Methodology 36 3–11.
• Jiang, J., Nguyen, T. and Rao, J. S. (2011). Best predictive small area estimation. J. Amer. Statist. Assoc. 106 732–745.
• Jiang, J., Rao, J. S., Gu, Z. and Nguyen, T. (2008). Fence methods for mixed model selection. Ann. Statist. 36 1669–1692.
• Judkins, D. R. and Liu, J. (2000). Correcting the bias in the range of a statistic across small areas. Journal of Official Statistics 16 1–13.
• Kott, P. S. (2009). Calibration weighting: Combining probability samples and linear prediction models. In Sample Surveys: Inference and Analysis (D. Pfeffermann and C. R. Rao, eds.). Handbook of Statistics 29B 55–82. North-Holland, Amsterdam.
• Lehtonen, R., Särndal, C. E. and Veijanen, A. (2003). The effect of model choice in estimation for domains, including small domains. Survey Methodology 29 33–44.
• Lehtonen, R., Särndal, C. E. and Veijanen, A. (2005). Does the model matter? Comparing model-assisted and model-dependent estimators of class frequencies for domains. Statistics in Transition 7 649–673.
• Lehtonen, R. and Veijanen, A. (2009). Design-based methods of estimation for domains and small areas. In Sample Surveys: Inference and Analysis (D. Pfeffermann and C. R. Rao, eds.). Handbook of Statistics 29B 219–249. North-Holland, Amsterdam.
• Lohr, S. L. and Rao, J. N. K. (2009). Jackknife estimation of mean squared error of small area predictors in nonlinear mixed models. Biometrika 96 457–468.
• MacGibbon, B. and Tomberlin, T. J. (1989). Small area estimates of proportions via empirical Bayes techniques. Survey Methodology 15 237–252.
• Malec, D., Davis, W. W. and Cao, X. (1999). Model-based small area estimates of overweight prevalence using sample selection adjustment. Stat. Med. 18 3189–3200.
• Malinovsky, Y. and Rinott, Y. (2010). Prediction of ordered random effects in a simple small area model. Statist. Sinica 20 697–714.
• Mohadjer, L., Rao, J. N. K., Liu, B., Krenzke, T. and Van De Kerckhove, W. (2007). Hierarchical Bayes small area estimates of adult literacy using unmatched sampling and linking models. In Proceedings of the Survey Research Methods Section 3203–3210. American Statistical Association, Alexandria, VA.
• Molina, I. and Rao, J. N. K. (2010). Small area estimation of poverty indicators. Canad. J. Statist. 38 369–385.
• Nandram, B. and Choi, J. W. (2010). A Bayesian analysis of body mass index data from small domains under nonignorable nonresponse and selection. J. Amer. Statist. Assoc. 105 120–135.
• Nandram, B. and Sayit, H. (2011). A Bayesian analysis of small area probabilities under a constraint. Survey Methodology 37 137–152.
• Opsomer, J. D., Claeskens, G., Ranalli, M. G., Kauermann, G. and Breidt, F. J. (2008). Non-parametric small area estimation using penalized spline regression. J. R. Stat. Soc. Ser. B Stat. Methodol. 70 265–286.
• Pan, Z. and Lin, D. Y. (2005). Goodness-of-fit methods for generalized linear mixed models. Biometrics 61 1000–1009.
• Pfeffermann, D. (2002). Small area estimation—new developments and directions. International Statistical Review 70 125–143.
• Pfeffermann, D. and Correa, S. (2012). Empirical bootstrap bias correction and estimation of prediction mean square error in small area estimation. Biometrika 99 457–472.
• Pfeffermann, D. and Sverchkov, M. (2007). Small-area estimation under informative probability sampling of areas and within the selected areas. J. Amer. Statist. Assoc. 102 1427–1439.
• Pfeffermann, D., Terryn, B. and Moura, F. A. S. (2008). Small area estimation under a two-part random effects model with application to estimation of literacy in developing countries. Survey Methodology 34 235–249.
• Pfeffermann, D. and Tiller, R. (2006). Small-area estimation with state-space models subject to benchmark constraints. J. Amer. Statist. Assoc. 101 1387–1397.
• Prasad, N. G. N. and Rao, J. N. K. (1990). The estimation of the mean squared error of small-area estimators. J. Amer. Statist. Assoc. 85 163–171.
• Rao, J. N. K. (2003). Small Area Estimation. Wiley, Hoboken, NJ.
• Rao, J. N. K. (2005). Inferential issues in small area estimation: Some new developments. Statistics in Transition 7 513–526.
• Rao, J. N. K. (2008). Some methods for small area estimation. Rivista Internazionale di Scienze Sociali 4 387–406.
• Rao, J. N. K., Sinha, S. K. and Roknossadati, M. (2009). Robust small area estimation using penalized spline mixed models. In Proceedings of the Survey Research Methods Section 145–153. American Statistical Association, Alexandria, VA.
• Sinha, S. K. and Rao, J. N. K. (2009). Robust small area estimation. Canad. J. Statist. 37 381–399.
• Torabi, M., Datta, G. S. and Rao, J. N. K. (2009). Empirical Bayes estimation of small area means under a nested error linear regression model with measurement errors in the covariates. Scand. J. Statist. 36 355–368.
• Torabi, M. and Rao, J. N. K. (2008). Small area estimation under a two-level model. Survey Methodology 34 11–17.
• Tzavidis, N., Marchetti, S. and Chambers, R. (2010). Robust estimation of small-area means and quantiles. Aust. N. Z. J. Stat. 52 167–186.
• Ugarte, M. D., Militino, A. F. and Goicoa, T. (2009). Benchmarked estimates in small areas using linear mixed models with restrictions. Test 18 342–364.
• Vaida, F. and Blanchard, S. (2005). Conditional Akaike information for mixed-effects models. Biometrika 92 351–370.
• Wang, J., Fuller, W. A. and Qu, Y. (2008). Small area estimation under a restriction. Survey Methodology 34 29–36.
• Wright, D. L., Stern, H. S. and Cressie, N. (2003). Loss functions for estimation of extrema with an application to disease mapping. Canad. J. Statist. 31 251–266.
• Yan, G. and Sedransk, J. (2007). Bayesian diagnostic techniques for detecting hierarchical structure. Bayesian Anal. 2 735–760.
• Yan, G. and Sedransk, J. (2010). A note on Bayesian residuals as a hierarchical model diagnostic technique. Statist. Papers 51 1–10.
• Ybarra, L. M. R. and Lohr, S. L. (2008). Small area estimation when auxiliary information is measured with error. Biometrika 95 919–931.
• You, Y. and Rao, J. N. K. (2002). A pseudo-empirical best linear unbiased prediction approach to small area estimation using survey weights. Canad. J. Statist. 30 431–439.
• Zhang, L. C. (2009). Estimates for small area compositions subjected to informative missing data. Survey Methodology 35 191–201.
• Zhang, L. C. and Chambers, R. L. (2004). Small area estimates for cross-classifications. J. R. Stat. Soc. Ser. B Stat. Methodol. 66 479–496.