Statistical Science

Recent Developments in Bootstrap Methodology

A. C. Davison, D. V. Hinkley, and G. A. Young
Source: Statist. Sci. Volume 18, Issue 2 (2003), 141-157.

Abstract

Ever since its introduction, the bootstrap has provided both a powerful set of solutions for practical statisticians and a rich source of theoretical and methodological problems for statistics. In this article, some recent developments in bootstrap methodology are reviewed and discussed. After a brief introduction to the bootstrap, we consider the following topics at varying levels of detail: the use of bootstrapping for highly accurate parametric inference; theoretical properties of nonparametric bootstrapping with unequal probabilities; subsampling and the m out of n bootstrap; bootstrap failures and remedies for superefficient estimators; recent topics in significance testing; bootstrap improvements of unstable classifiers; and resampling for dependent data. The treatment is telegraphic rather than exhaustive.

Full-text: Open access
Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.ss/1063994969
Digital Object Identifier: doi:10.1214/ss/1063994969
Mathematical Reviews number (MathSciNet): MR2026076

2012 © Institute of Mathematical Statistics
