Source: Ann. Statist. Volume 32, Number 6 (2004), 2580-2615.
We propose and analyze nonparametric tests of the null hypothesis that a function belongs to a specified parametric family. The tests are based on BIC approximations, πBIC, to the posterior probability of the null model, and may be carried out in either Bayesian or frequentist fashion. We obtain results on the asymptotic distribution of πBIC under both the null hypothesis and local alternatives. One version of πBIC, call it πBIC*, uses a class of models that are orthogonal to each other and growing in number without bound as sample size, n, tends to infinity. We show that √n(1−πBIC*) converges in distribution to a stable law under the null hypothesis. We also show that πBIC* can detect local alternatives converging to the null at the rate √(log n / n). A particularly interesting finding is that the power of the πBIC*-based test is asymptotically equal to that of a test based on the maximum of alternative log-likelihoods. Simulation results and an example involving variable star data illustrate desirable features of the proposed tests.
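As a rough illustration of what a BIC approximation to the posterior probability of the null model can look like, the sketch below uses the generic BIC-weight formula. It is not the paper's exact definition of πBIC or πBIC*, which the abstract does not give. The function names, the convention BIC = -2 log-likelihood + (number of parameters) log n, and the equal prior probabilities across models are all assumptions made for this example.

```python
import math


def bic(loglik, n_params, n):
    """BIC in the '-2 log-likelihood + k log n' convention (an assumed convention)."""
    return -2.0 * loglik + n_params * math.log(n)


def posterior_prob_null(bic_null, bic_alternatives):
    """Generic BIC-weight approximation to P(null model | data).

    Assumes equal prior probability for the null and each alternative model,
    so that the result equals 1 / (1 + sum_k exp(-(BIC_k - BIC_0) / 2)).
    """
    # Differences of each alternative's BIC from the null's BIC.
    deltas = [b - bic_null for b in bic_alternatives]
    # Shift all exponents so none is positive; this avoids overflow
    # when some alternative fits far better than the null.
    shift = min([0.0] + deltas)
    null_weight = math.exp(shift / 2.0)
    alt_weight = sum(math.exp(-(d - shift) / 2.0) for d in deltas)
    return null_weight / (null_weight + alt_weight)
```

Under these assumptions, a value near 1 favors the null model and a value near 0 favors at least one alternative. A frequentist version of the test would compare the statistic with a critical value taken from its null distribution.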