References
[1] Marc Aerts, Gerda Claeskens, and Jeffrey D. Hart. Testing the fit of a parametric function. J. Amer. Statist. Assoc., 94(447):869–879, 1999.
[2] Hirotugu Akaike. Statistical predictor identification. Ann. Inst. Statist. Math., 22:203–217, 1970.
[3] Hirotugu Akaike. Information theory and an extension of the maximum likelihood principle. In Second International Symposium on Information Theory (Tsahkadsor, 1971), pages 267–281. Akadémiai Kiadó, Budapest, 1973.
[4] David M. Allen. The relationship between variable selection and data augmentation and a method for prediction. Technometrics, 16:125–127, 1974.
[5] Miguel A. Arcones and Evarist Giné. On the bootstrap of M-estimators and other statistical functionals. In Exploring the limits of bootstrap (East Lansing, MI, 1990), Wiley Ser. Probab. Math. Statist. Probab. Math. Statist., pages 13–47. Wiley, New York, 1992.
[6] Sylvain Arlot. Resampling and Model Selection. PhD thesis, University Paris-Sud 11, December 2007. oai:tel.archives-ouvertes.fr:tel-00198803_v1.
[7] Sylvain Arlot. Suboptimality of penalties proportional to the dimension for model selection in heteroscedastic regression, December 2008. arXiv:0812.3141v1
[8] Sylvain Arlot. Technical appendix to “Model selection by resampling penalization”, 2009. Appendix to hal-00262478.
[9] Sylvain Arlot. V-fold cross-validation improved: V-fold penalization, February 2008. arXiv:0802.0566v2.
[10] Sylvain Arlot, Gilles Blanchard, and Étienne Roquain. Some non-asymptotic results on resampling in high dimension, I: confidence regions. Ann. Statist., 2008. To appear.
[11] Sylvain Arlot and Pascal Massart. Data-driven calibration of penalties for least-squares regression. J. Mach. Learn. Res., 10(Feb):245–279, 2009.
[12] Jean-Yves Audibert. Théorie Statistique de l’Apprentissage: une approche PAC-Bayésienne. PhD thesis, Université Paris VI, June 2004.
[13] Yannick Baraud. Model selection for regression on a fixed design. Probab. Theory Related Fields, 117(4):467–493, 2000.
[14] Yannick Baraud. Model selection for regression on a random design. ESAIM Probab. Statist., 6:127–146 (electronic), 2002.
[15] Philippe Barbe and Patrice Bertail. The weighted bootstrap, volume 98 of Lecture Notes in Statistics. Springer-Verlag, New York, 1995.
[16] Andrew Barron, Lucien Birgé, and Pascal Massart. Risk bounds for model selection via penalization. Probab. Theory Related Fields, 113(3):301–413, 1999.
[17] Peter L. Bartlett, Stéphane Boucheron, and Gábor Lugosi. Model selection and error estimation. Machine Learning, 48:85–113, 2002.
[18] Peter L. Bartlett, Olivier Bousquet, and Shahar Mendelson. Local Rademacher complexities. Ann. Statist., 33(4):1497–1537, 2005.
[19] Peter L. Bartlett, Shahar Mendelson, and Petra Philips. Local complexities for empirical risk minimization. In Learning theory, volume 3120 of Lecture Notes in Comput. Sci., pages 270–284. Springer, Berlin, 2004.
[20] Lucien Birgé and Pascal Massart. Gaussian model selection. J. Eur. Math. Soc. (JEMS), 3(3):203–268, 2001.
[21] Lucien Birgé and Pascal Massart. Minimal penalties for Gaussian model selection. Probab. Theory Related Fields, 138(1-2):33–73, 2007.
[22] Prabir Burman. A comparative study of ordinary cross-validation, v-fold cross-validation and the repeated learning-testing methods. Biometrika, 76(3):503–514, 1989.
[23] Prabir Burman. Estimation of equifrequency histograms. Statist. Probab. Lett., 56(3):227–238, 2002.
[24] Olivier Catoni. Pac-Bayesian Supervised Classification: The Thermodynamics of Statistical Learning, volume 56 of IMS Lecture Notes Monograph Series. Inst. Math. Statist., 2007.
[25] Joseph E. Cavanaugh and Robert H. Shumway. A bootstrap variant of AIC for state-space model selection. Statist. Sinica, 7(2):473–496, 1997.
[26] Luc Devroye and Gábor Lugosi. Combinatorial methods in density estimation. Springer Series in Statistics. Springer-Verlag, New York, 2001.
[27] David L. Donoho and Iain M. Johnstone. Adapting to unknown smoothness via wavelet shrinkage. J. Amer. Statist. Assoc., 90(432):1200–1224, 1995.
[28] Sam Efromovich and Mark Pinsker. Sharp-optimal and adaptive estimation for heteroscedastic nonparametric regression. Statist. Sinica, 6(4):925–942, 1996.
[29] Bradley Efron. Bootstrap methods: another look at the jackknife. Ann. Statist., 7(1):1–26, 1979.
[30] Bradley Efron. Estimating the error rate of a prediction rule: improvement on cross-validation. J. Amer. Statist. Assoc., 78(382):316–331, 1983.
[31] Bradley Efron. How biased is the apparent error rate of a prediction rule? J. Amer. Statist. Assoc., 81(394):461–470, 1986.
[32] Bradley Efron and Robert Tibshirani. Improvements on cross-validation: the .632+ bootstrap method. J. Amer. Statist. Assoc., 92(438):548–560, 1997.
[33] Magalie Fromont. Model selection by bootstrap penalization for classification. In Learning theory, volume 3120 of Lecture Notes in Comput. Sci., pages 285–299. Springer, Berlin, 2004.
[34] Magalie Fromont. Model selection by bootstrap penalization for classification. Mach. Learn., 66(2–3):165–207, 2007.
[35] Leonid Galtchouk and Sergey Pergamenshchikov. Adaptive asymptotically efficient estimation in heteroscedastic nonparametric regression via model selection, October 2008. arXiv:0810.1173.
[36] Seymour Geisser. The predictive sample reuse method with applications. J. Amer. Statist. Assoc., 70:320–328, 1975.
[37] Xavier Gendre. Simultaneous estimation of the mean and the variance in heteroscedastic Gaussian regression. Electronic Journal of Statistics, 2:1345–1372, 2008.
[38] László Györfi, Michael Kohler, Adam Krzyżak, and Harro Walk. A distribution-free theory of nonparametric regression. Springer Series in Statistics. Springer-Verlag, New York, 2002.
[39] Peter Hall. The bootstrap and Edgeworth expansion. Springer Series in Statistics. Springer-Verlag, New York, 1992.
[40] Peter Hall and Enno Mammen. On general resampling algorithms and their performance in distribution estimation. Ann. Statist., 22(4):2011–2030, 1994.
[41] Don Hush and Clint Scovel. Concentration of the hypergeometric distribution. Statist. Probab. Lett., 75(2):127–132, 2005.
[42] Marie Hušková and Paul Janssen. Consistency of the generalized bootstrap for degenerate U-statistics. Ann. Statist., 21(4):1811–1823, 1993.
[43] Makio Ishiguro, Yosiyuki Sakamoto, and Genshiro Kitagawa. Bootstrapping log likelihood and EIC, an extension of AIC. Ann. Inst. Statist. Math., 49(3):411–434, 1997.
[44] C. Matthew Jones and Anatoly A. Zhigljavsky. Approximating the negative moments of the Poisson distribution. Statist. Probab. Lett., 66(2):171–181, 2004.
[45] Vladimir Koltchinskii. Rademacher penalties and structural risk minimization. IEEE Trans. Inform. Theory, 47(5):1902–1914, 2001.
[46] Vladimir Koltchinskii. Local Rademacher complexities and oracle inequalities in risk minimization. Ann. Statist., 34(6):2593–2656, 2006.
[47] A. P. Korostelëv and A. B. Tsybakov. Minimax theory of image reconstruction, volume 82 of Lecture Notes in Statistics. Springer-Verlag, New York, 1993.
[48] Robert A. Lew. Bounds on negative moments. SIAM J. Appl. Math., 30(4):728–731, 1976.
[49] Ker-Chau Li. Asymptotic optimality for Cp, CL, cross-validation and generalized cross-validation: discrete index set. Ann. Statist., 15(3):958–975, 1987.
[50] Gábor Lugosi and Marten Wegkamp. Complexity regularization via localized random penalties. Ann. Statist., 32(4):1679–1697, 2004.
[51] Colin L. Mallows. Some comments on Cp. Technometrics, 15:661–675, 1973.
[52] Enno Mammen. When does bootstrap work? Asymptotic results and simulations, volume 77 of Lecture Notes in Statistics. Springer, 1992.
[53] Enno Mammen and Alexandre B. Tsybakov. Smooth discrimination analysis. Ann. Statist., 27(6):1808–1829, 1999.
[54] David M. Mason and Michael A. Newton. A rank statistics approach to the consistency of a general bootstrap. Ann. Statist., 20(3):1611–1624, 1992.
[55] Pascal Massart. Concentration inequalities and model selection, volume 1896 of Lecture Notes in Mathematics. Springer, Berlin, 2007. Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23, 2003, With a foreword by Jean Picard.
[56] Dimitris N. Politis, Joseph P. Romano, and Michael Wolf. Subsampling. Springer Series in Statistics. Springer-Verlag, New York, 1999.
[57] Jens Præstgaard and Jon A. Wellner. Exchangeably weighted bootstraps of the general empirical process. Ann. Probab., 21(4):2053–2086, 1993.
[58] Marie Sauvé. Histogram selection in non Gaussian regression. ESAIM: Probability and Statistics, 13:70–86, 2009.
[59] Jun Shao. Bootstrap model selection. J. Amer. Statist. Assoc., 91(434):655–665, 1996.
[60] Jun Shao. An asymptotic theory for linear model selection. Statist. Sinica, 7(2):221–264, 1997. With comments and a rejoinder by the author.
[61] Ritei Shibata. An optimal selection of regression variables. Biometrika, 68(1):45–54, 1981.
[62] Ritei Shibata. Bootstrap estimate of Kullback-Leibler information for model selection. Statist. Sinica, 7(2):375–394, 1997.
[63] Charles J. Stone. Optimal rates of convergence for nonparametric estimators. Ann. Statist., 8(6):1348–1360, 1980.
[64] Charles J. Stone. An asymptotically optimal histogram selection rule. In Proceedings of the Berkeley conference in honor of Jerzy Neyman and Jack Kiefer, Vol. II (Berkeley, Calif., 1983), Wadsworth Statist./Probab. Ser., pages 513–520, Belmont, CA, 1985. Wadsworth.
[65] Mervyn Stone. Cross-validatory choice and assessment of statistical predictions. J. Roy. Statist. Soc. Ser. B, 36:111–147, 1974. With discussion by G.A. Barnard, A.C. Atkinson, L.K. Chan, A.P. Dawid, F. Downton, J. Dickey, A.G. Baker, O. Barndorff-Nielsen, D.R. Cox, S. Giesser, D. Hinkley, R.R. Hocking, and A.S. Young, and with a reply by the authors.
[66] Aad W. van der Vaart and Jon A. Wellner. Weak convergence and empirical processes. Springer Series in Statistics. Springer-Verlag, New York, 1996. With applications to statistics.
[67] Chien-Fu Jeff Wu. Jackknife, bootstrap and other resampling methods in regression analysis. Ann. Statist., 14(4):1261–1350, 1986. With discussion and a rejoinder by the author.
[68] Yuhong Yang. Consistency of cross validation for comparing regression procedures. Ann. Statist., 35(6):2450–2473, 2007.
[69] Yuhong Yang and Andrew Barron. Information-theoretic determination of minimax rates of convergence. Ann. Statist., 27(5):1564–1599, 1999.
[70] Marko Žnidarič. Asymptotic expansions for inverse moments of binomial and poisson distributions. arXiv:math.ST/0511226, November 2005.