The Annals of Statistics

Parametric bootstrap approximation to the distribution of EBLUP and related prediction intervals in linear mixed models

Snigdhansu Chatterjee, Partha Lahiri, and Huilin Li

Source: Ann. Statist. Volume 36, Number 3 (2008), 1221-1245.

Abstract

The empirical best linear unbiased prediction (EBLUP) method uses a linear mixed model to combine information from different sources. This method is particularly useful in small area problems. The variability of an EBLUP is traditionally measured by the mean squared prediction error (MSPE), and interval estimates are generally constructed using estimates of the MSPE. Such methods have shortcomings such as under-coverage or over-coverage, excessive length and lack of interpretability. We propose a parametric bootstrap approach to estimate the entire distribution of a suitably centered and scaled EBLUP. The bootstrap histogram is highly accurate, and differs from the true EBLUP distribution by only O(d^3 n^{-3/2}), where d is the number of parameters and n the number of observations. This result is used to obtain highly accurate prediction intervals. Simulation results demonstrate the superiority of this method over existing techniques of constructing prediction intervals in linear mixed models.
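
The idea of bootstrapping a centered and scaled EBLUP can be illustrated with a minimal sketch. It is not the authors' implementation, and the full text was not available here. The sketch assumes the area-level Fay-Herriot special case of the linear mixed model (y_i = x_i'beta + v_i + e_i with v_i ~ N(0, A), e_i ~ N(0, D_i), D_i known), a Prasad-Rao style moment estimator of A truncated at a small positive floor, and scaling by sqrt(A D_i / (A + D_i)). The function names `fit_fh`, `eblup`, and `pb_intervals` are hypothetical.

```python
import numpy as np

def fit_fh(y, X, D, floor=1e-2):
    """Estimate (beta, A) in a Fay-Herriot model with known sampling variances D."""
    m, p = X.shape
    # OLS residuals and leverages feed a moment estimator of A.
    XtX_inv = np.linalg.inv(X.T @ X)
    resid = y - X @ (XtX_inv @ (X.T @ y))
    lev = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
    A = (resid @ resid - np.sum(D * (1.0 - lev))) / (m - p)
    A = max(A, floor)  # truncation away from zero: an assumption of this sketch
    # Weighted least squares for beta given the estimated A.
    w = 1.0 / (A + D)
    beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    return beta, A

def eblup(y, X, D, beta, A):
    """Return the EBLUP of each area mean and the scale used for the pivot."""
    B = D / (A + D)                       # shrinkage toward the regression fit
    theta_hat = (1.0 - B) * y + B * (X @ beta)
    scale = np.sqrt(A * D / (A + D))      # assumed scaling for the pivot
    return theta_hat, scale

def pb_intervals(y, X, D, level=0.95, n_boot=1000, seed=0):
    """Parametric bootstrap prediction intervals for the area means."""
    rng = np.random.default_rng(seed)
    beta, A = fit_fh(y, X, D)
    theta_hat, scale = eblup(y, X, D, beta, A)
    m = len(y)
    pivots = np.empty((n_boot, m))
    for b in range(n_boot):
        # Simulate from the fitted model, then refit on the simulated data.
        theta_star = X @ beta + rng.normal(0.0, np.sqrt(A), m)
        y_star = theta_star + rng.normal(0.0, np.sqrt(D))
        beta_s, A_s = fit_fh(y_star, X, D)
        th_s, sc_s = eblup(y_star, X, D, beta_s, A_s)
        pivots[b] = (theta_star - th_s) / sc_s
    alpha = 1.0 - level
    q_lo, q_hi = np.quantile(pivots, [alpha / 2, 1.0 - alpha / 2], axis=0)
    # Bootstrap quantiles of the pivot replace the usual normal cutoffs.
    return theta_hat + q_lo * scale, theta_hat + q_hi * scale
```

Calling `pb_intervals(y, X, D)` returns arrays of lower and upper limits, one pair per area. The only change from a conventional interval of the form EBLUP ± z × (estimated root MSPE) is that the cutoffs come from the bootstrap histogram of the pivot rather than from a normal table.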

Primary Subjects: 62D05
Secondary Subjects: 62F40, 62F25
Keywords: Predictive distribution; prediction interval; linear mixed model; small area; bootstrap; coverage accuracy

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aos/1211819562
Digital Object Identifier: doi:10.1214/07-AOS512
Mathematical Reviews number (MathSciNet): MR2418655
Zentralblatt MATH identifier: 05294971

2009 © Institute of Mathematical Statistics