Statistical Science

Integrated likelihood methods for eliminating nuisance parameters

James O. Berger, Brunero Liseo, and Robert L. Wolpert


Elimination of nuisance parameters is a central problem in statistical inference and has been formally studied in virtually all approaches to inference. Perhaps the least studied approach is elimination of nuisance parameters through integration, which is viewed as an almost incidental byproduct of Bayesian analysis and hence is not deemed to require separate study. There is, however, considerable value in considering integrated likelihood on its own, especially versions arising from default or noninformative priors. In this paper, we review such common integrated likelihoods and discuss their strengths and weaknesses relative to other methods.

Article information

Statist. Sci. Volume 14, Number 1 (1999), 1-28.

First available in Project Euclid: 24 December 2001

Keywords: Marginal likelihood; nuisance parameters; profile likelihood; reference priors


Berger, James O.; Liseo, Brunero; Wolpert, Robert L. Integrated likelihood methods for eliminating nuisance parameters. Statist. Sci. 14 (1999), no. 1, 1-28. doi:10.1214/ss/1009211804.

