Open Access
Posterior Model Consistency in Variable Selection as the Model Dimension Grows
Elías Moreno, Javier Girón, George Casella
Statist. Sci. 30(2): 228-241 (May 2015). DOI: 10.1214/14-STS508
Abstract

Most consistency analyses of Bayesian procedures for variable selection in regression concern pairwise consistency, that is, consistency of Bayes factors. Variable selection in regression, however, is carried out within a given class of regression models, where a natural variable selector is the posterior probability of the models.

In this paper we analyze the consistency of the posterior model probabilities when the number of potential regressors grows as the sample size grows. The novelty in the posterior model consistency is that it depends not only on the priors for the model parameters through the Bayes factor, but also on the model priors, so that it is a useful tool for choosing priors for both models and model parameters.
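
The dependence on both kinds of prior can be made explicit with the standard expression for posterior model probabilities. The notation below is an assumption introduced for illustration and is not taken from the paper: $M_0$ is a reference model in the class, $m_j(y)$ is the marginal likelihood of model $M_j$, $\pi(M_j)$ is its prior probability, and $M_T$ is the model that generated the data.

```latex
\[
  P(M_j \mid y)
    \;=\; \frac{B_{j0}(y)\,\pi(M_j)}{\sum_{k} B_{k0}(y)\,\pi(M_k)},
  \qquad
  B_{j0}(y) \;=\; \frac{m_j(y)}{m_0(y)} .
\]
% Posterior model consistency: the true model's posterior probability tends to one,
\[
  P(M_T \mid y) \;\longrightarrow\; 1
  \quad \text{in probability under } M_T \text{ as } n \to \infty .
\]
```

The priors on the model parameters enter only through the Bayes factors $B_{j0}(y)$, while the model priors $\pi(M_j)$ enter directly. When the number of models in the sum grows with $n$, pairwise consistency of each Bayes factor does not by itself guarantee that the ratio converges to one.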

We find that some classes of priors commonly used in variable selection yield posterior model inconsistency, while mixtures of these priors mitigate this undesirable behavior.
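
As a rough illustration of what mixing a prior can change, the sketch below compares two model priors over the $2^p$ subsets of $p$ regressors. The abstract does not say which priors the paper studies, so both choices are assumptions made here: an independent Bernoulli($\theta$) inclusion prior with fixed $\theta = 1/2$, and its mixture over $\theta \sim \mathrm{Uniform}(0,1)$. The function names are hypothetical.

```python
from math import comb


def bernoulli_model_prior(k: int, p: int, theta: float = 0.5) -> float:
    """Prior probability of one specific model containing k of p regressors,
    when each regressor is included independently with probability theta."""
    return theta**k * (1.0 - theta) ** (p - k)


def mixed_model_prior(k: int, p: int) -> float:
    """Prior of the same model after mixing theta over Uniform(0, 1).

    The integral of theta^k (1 - theta)^(p - k) over (0, 1) equals
    1 / ((p + 1) * C(p, k)).
    """
    return 1.0 / ((p + 1) * comb(p, k))


if __name__ == "__main__":
    # Prior mass on the null model (k = 0) as the number of regressors grows.
    for p in (5, 20, 50):
        fixed = bernoulli_model_prior(0, p)
        mixed = mixed_model_prior(0, p)
        print(f"p={p:3d}  fixed theta=1/2: {fixed:.3e}   uniform mixture: {mixed:.3e}")
```

With fixed $\theta = 1/2$ every model receives mass $2^{-p}$, which vanishes exponentially in $p$. Under the mixture, each model dimension receives total mass $1/(p+1)$, so small models keep far more prior mass as $p$ grows. This shows only how the two priors differ; it does not reproduce the paper's consistency results.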

For moderate sample sizes, we evaluate Bayesian pairwise variable selection procedures by comparing their frequentist Type I and Type II error probabilities. This provides valuable information for discriminating between the priors for the model parameters commonly used in variable selection.
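
A frequentist evaluation of this kind can be sketched by Monte Carlo simulation. The example below is a minimal sketch under stated assumptions and is not the paper's experiment. It tests a single regressor against the intercept-only model using the closed-form Bayes factor for Zellner's $g$-prior with fixed $g = n$, and selects the larger model when the Bayes factor exceeds 1 (equal prior model probabilities). The function names, effect size, and sample size are all hypothetical.

```python
import numpy as np


def log_bf_g_prior(y: np.ndarray, X: np.ndarray, g: float) -> float:
    """Log Bayes factor of the model with regressors X (plus intercept) against
    the intercept-only model, under Zellner's g-prior with fixed g:
    BF = (1 + g)^((n - 1 - p) / 2) * (1 + g * (1 - R^2))^(-(n - 1) / 2)."""
    n, p = X.shape
    yc = y - y.mean()            # centering absorbs the intercept
    Xc = X - X.mean(axis=0)
    beta, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    resid = yc - Xc @ beta
    r2 = 1.0 - (resid @ resid) / (yc @ yc)
    return 0.5 * (n - 1 - p) * np.log1p(g) - 0.5 * (n - 1) * np.log1p(g * (1.0 - r2))


def error_rates(n: int = 50, beta: float = 0.5, reps: int = 2000, seed: int = 0):
    """Monte Carlo estimates of the Type I and Type II error probabilities of
    the rule 'select the regressor when BF > 1' (assumed decision rule)."""
    rng = np.random.default_rng(seed)
    type1 = 0
    type2 = 0
    for _ in range(reps):
        x = rng.standard_normal((n, 1))
        y_null = rng.standard_normal(n)                   # regressor is irrelevant
        y_alt = beta * x[:, 0] + rng.standard_normal(n)   # regressor is relevant
        type1 += int(log_bf_g_prior(y_null, x, g=n) > 0.0)
        type2 += int(log_bf_g_prior(y_alt, x, g=n) <= 0.0)
    return type1 / reps, type2 / reps


if __name__ == "__main__":
    t1, t2 = error_rates()
    print(f"Type I error ~ {t1:.3f}, Type II error ~ {t2:.3f}")
```

Replacing `log_bf_g_prior` with the Bayes factor for another parameter prior and rerunning `error_rates` over a grid of `n` and `beta` gives the kind of side-by-side comparison the abstract describes.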

Copyright © 2015 Institute of Mathematical Statistics
Elías Moreno, Javier Girón, and George Casella "Posterior Model Consistency in Variable Selection as the Model Dimension Grows," Statistical Science 30(2), 228-241, (May 2015). https://doi.org/10.1214/14-STS508