The Annals of Statistics

Curvature and inference for maximum likelihood estimates

Bradley Efron



Maximum likelihood estimates are sufficient statistics in exponential families, but not in general. The theory of statistical curvature was introduced to measure the effects of MLE insufficiency in one-parameter families. Here, we analyze curvature in the more realistic venue of multiparameter families—more exactly, curved exponential families, a broad class of smoothly defined nonexponential family models. We show that within the set of observations giving the same value for the MLE, there is a “region of stability” outside of which the MLE is no longer even a local maximum. Accuracy of the MLE is affected by the location of the observation vector within the region of stability. Our motivating example involves “$g$-modeling,” an empirical Bayes estimation procedure.
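
The geometry behind the "region of stability" can be sketched in a few lines. The notation below is a standard one for curved exponential families and is an assumption here, not necessarily the paper's own: $y$ is the $p$-dimensional sufficient vector of the full exponential family, $\eta_\theta$ is the natural parameter restricted to a smooth $q$-dimensional surface indexed by $\theta$ (with $q < p$), $\psi$ is the normalizing function, $\mu_\theta$ and $V_\theta$ are the mean vector and covariance matrix of $y$, and $\dot\eta_\theta$ is the $p \times q$ Jacobian of $\eta_\theta$.

$$
\begin{aligned}
l_\theta(y) &= \eta_\theta^\top y - \psi(\eta_\theta) + \text{const}, \\
\dot l_\theta(y) &= \dot\eta_\theta^\top (y - \mu_\theta), \\
-\ddot l_\theta(y) &= \dot\eta_\theta^\top V_\theta \, \dot\eta_\theta - \sum_{k=1}^{p} \ddot\eta_{\theta,k} \, (y_k - \mu_{\theta,k}).
\end{aligned}
$$

Setting the score $\dot l_{\hat\theta}(y)$ to zero shows that all observation vectors sharing the same MLE $\hat\theta$ lie in a $(p-q)$-dimensional hyperplane through $\mu_{\hat\theta}$. The first term of the observed information $-\ddot l_{\hat\theta}(y)$ is the Fisher information, which is the same for every such $y$. The second term depends on where $y$ sits in the hyperplane, and it vanishes when $\eta_\theta$ is linear in $\theta$, that is, when the family is not curved. Under these assumptions, $\hat\theta$ can be a local maximum only while the observed information stays positive semidefinite, which is one way to read the abstract's statement that the MLE stops being a local maximum outside a bounded region.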

Article information

Ann. Statist., Volume 46, Number 4 (2018), 1664-1692.

Received: November 2016
Revised: June 2017
First available in Project Euclid: 27 June 2018


Primary: 62Bxx: Sufficiency and information
Secondary: 62Hxx: Multivariate analysis [See also 60Exx]

Keywords: Observed information; $g$-modeling; region of stability; curved exponential families; regularized MLE


Efron, Bradley. Curvature and inference for maximum likelihood estimates. Ann. Statist. 46 (2018), no. 4, 1664-1692. doi:10.1214/17-AOS1598.


