The Annals of Statistics

A maximum likelihood method for the incidental parameter problem

Marcelo J. Moreira


This paper uses the invariance principle to solve the incidental parameter problem of Neyman and Scott [Econometrica 16 (1948) 1–32]. We seek group actions that preserve the structural parameter and yield a maximal invariant in the parameter space with fixed dimension. M-estimation from the likelihood of the maximal invariant statistic yields the maximum invariant likelihood estimator (MILE). Consistency of MILE is straightforward in cases where the likelihood of the maximal invariant is the product of marginal likelihoods. We illustrate this result with a stationary autoregressive model with fixed effects and an agent-specific monotonic transformation model.

The asymptotic properties of MILE when the likelihood of the maximal invariant does not factorize remain an open question. We are nevertheless able to show that MILE is consistent, asymptotically normal and efficient when invariance yields Wishart distributions. Two examples are an instrumental variable (IV) model and a dynamic panel data model with fixed effects.
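
To make the incidental parameter problem concrete, the following minimal sketch simulates the classic many-means setting usually associated with Neyman and Scott: y_it = alpha_i + sigma * e_it with a fixed number of periods T and a growing number of units N. It is not one of the paper's examples; the function names, the seed, and the choice of location shifts as the group action are assumptions made for illustration only. Within-unit deviations are unchanged when a constant is added to every observation of a unit, and their likelihood factorizes across units, so maximizing it amounts to dividing the within sum of squares by N(T-1) rather than NT.

```python
import random


def simulate_panel(n_units, n_periods, sigma, seed=0):
    """Draw y_it = alpha_i + sigma * e_it with one nuisance mean per unit."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        alpha_i = rng.gauss(0.0, 5.0)  # incidental parameter for unit i
        panel.append([alpha_i + rng.gauss(0.0, sigma) for _ in range(n_periods)])
    return panel


def within_sum_of_squares(panel):
    """Sum of squared deviations from unit means (invariant to unit shifts)."""
    total = 0.0
    for row in panel:
        mean_i = sum(row) / len(row)
        total += sum((y - mean_i) ** 2 for y in row)
    return total


def joint_mle_sigma2(panel):
    """MLE of sigma^2 when every alpha_i is estimated jointly with sigma^2."""
    n_obs = sum(len(row) for row in panel)
    return within_sum_of_squares(panel) / n_obs


def invariant_mle_sigma2(panel):
    """MLE of sigma^2 from the likelihood of the within-unit deviations only."""
    dof = sum(len(row) - 1 for row in panel)
    return within_sum_of_squares(panel) / dof


# Illustrative settings (assumed): many units, only two periods each.
panel = simulate_panel(n_units=20000, n_periods=2, sigma=1.0)
sigma2_joint = joint_mle_sigma2(panel)
sigma2_invariant = invariant_mle_sigma2(panel)
print("joint MLE:", sigma2_joint)
print("invariant-likelihood MLE:", sigma2_invariant)
```

With T = 2 and a true variance of 1, the joint MLE should settle near (T-1)/T = 0.5 however large N becomes, while the estimator based on the invariant statistic should settle near 1. The models treated in the paper are considerably richer, so this only conveys the basic mechanism.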

Article information

Ann. Statist., Volume 37, Number 6A (2009), 3660-3696.

First available in Project Euclid: 17 August 2009

Primary: C13 C23 60K35: Interacting random processes; statistical mechanics type models; percolation theory [See also 82B43, 82C43]
Secondary: C30

Keywords: Incidental parameters; invariance; maximum likelihood estimator; limits of experiments


Moreira, Marcelo J. A maximum likelihood method for the incidental parameter problem. Ann. Statist. 37 (2009), no. 6A, 3660–3696. doi:10.1214/09-AOS688.


  • [1] Abramowitz, M. and Stegun, I. A. (1965). Handbook of Mathematical Functions: With Formulas, Graphs, and Mathematical Tables. Dover, New York.
  • [2] Abrevaya, J. (2000). Rank estimation of a generalized fixed-effects regression model. J. Econometrics 95 1–23.
  • [3] Alvarez, J. and Arellano, M. (2003). The time-series and cross-section asymptotics of dynamic panel data estimators. Econometrica 71 1121–1159.
  • [4] Andersen, E. B. (1970). Asymptotic properties of conditional maximum likelihood estimators. J. Roy. Statist. Soc. Ser. B 32 283–301.
  • [5] Anderson, T. W. (1946). The noncentral Wishart distribution and certain problems of multivariate statistics. Ann. Math. Statist. 17 409–431.
  • [6] Anderson, T. W., Kunitomo, N. and Matsushita, Y. (2006). A new light from old wisdoms: Alternative estimation methods of simultaneous equations and microeconometric models. Unpublished manuscript, Univ. Tokyo.
  • [7] Andrews, D. W. K. (1992). Generic uniform convergence. Econometric Theory 8 241–256.
  • [8] Andrews, D. W. K., Moreira, M. J. and Stock, J. H. (2006). Optimal two-sided invariant similar tests for instrumental variables regression. Econometrica 74 715–752.
  • [9] Basu, D. (1977). On the elimination of nuisance parameters. J. Amer. Statist. Assoc. 72 355–366.
  • [10] Bekker, P. A. (1994). Alternative approximations to the distributions of instrumental variables estimators. Econometrica 62 657–681.
  • [11] Bhowmik, J. L. and King, M. L. (2008). Parameter estimation in semi-linear models using a maximal invariant likelihood function. J. Statist. Plann. Inference 139 1276–1285.
  • [12] Chamberlain, G. (2007). Decision theory applied to an instrumental variables model. Econometrica 75 609–652.
  • [13] Chamberlain, G. and Moreira, M. J. (2008). Decision theory applied to a linear panel data model. Econometrica 77 107–133.
  • [14] Chioda, L. and Jansson, M. (2007). Optimal invariant inference when the number of instruments is large. Unpublished manuscript, Univ. California, Berkeley.
  • [15] Cox, D. R. and Reid, N. (1987). Parameter orthogonality and approximate conditional inference. J. Roy. Statist. Soc. Ser. B 49 1–39.
  • [16] Eaton, M. (1989). Group Invariance Applications in Statistics. Regional Conference Series in Probability and Statistics 1. IMS, Hayward, CA.
  • [17] Hahn, J. and Kuersteiner, G. (2002). Asymptotically unbiased inference for a dynamic panel model with fixed effects when both N and T are large. Econometrica 70 1639–1657.
  • [18] Harville, D. (1974). Bayesian inference for variance components using only error contrasts. Biometrika 61 383–385.
  • [19] Johnson, N. L. and Kotz, S. (1970). Distributions in Statistics: Continuous Multivariate Distributions. Wiley, New York.
  • [20] Kalbfleisch, J. D. and Sprott, D. A. (1970). Application of likelihood methods to models involving large numbers of parameters. J. Roy. Statist. Soc. Ser. B 32 175–208.
  • [21] Kiefer, J. and Wolfowitz, J. (1956). Consistency of the maximum likelihood estimator in the presence of infinitely many incidental parameters. Ann. Math. Statist. 27 887–906.
  • [22] Kunitomo, N. (1980). Asymptotic expansions of distributions of estimators in a linear functional relationship and simultaneous equations. J. Amer. Statist. Assoc. 75 693–700.
  • [23] Laskar, M. R. and King, M. L. (1998). Estimation and testing of regression disturbances based on modified likelihood functions. J. Statist. Plann. Inference 71 75–92.
  • [24] Laskar, M. R. and King, M. L. (2001). Modified likelihood and related methods for handling nuisance parameters in the linear regression model. In Data Analysis from Statistical Foundations (A. K. M. E. Saleh, ed.) 119–142. Nova Science Publishers Inc., Huntington, NY.
  • [25] Lancaster, T. (2002). Orthogonal parameters and panel data. Rev. Econom. Stud. 69 647–666.
  • [26] Le Cam, L. and Yang, G. L. (2000). Asymptotics in Statistics: Some Basic Concepts, 2nd ed. Springer, New York.
  • [27] Lehmann, E. L. and Romano, J. P. (2005). Testing Statistical Hypotheses, 3rd ed. Springer, New York.
  • [28] Lele, S. R. and McCulloch, C. E. (2002). Invariance, identifiability, and morphometrics. J. Amer. Statist. Assoc. 97 796–806.
  • [29] Liang, K.-Y. and Zeger, S. L. (1995). Inference based on estimating functions in the presence of nuisance parameters. Statist. Sci. 10 158–173.
  • [30] Magnus, J. R. and Neudecker, H. (1988). Matrix Differential Calculus with Applications in Statistics and Econometrics. Wiley, New York.
  • [31] Morimune, K. (1983). Approximate distributions of k-class estimators when the degree of overidentification is large compared with sample size. Econometrica 51 821–841.
  • [32] Muirhead, R. J. (2005). Aspects of Multivariate Statistical Theory. Wiley, New York.
  • [33] Newey, W. and McFadden, D. L. (1994). Large sample estimation and hypothesis testing. In Handbook of Econometrics (R. F. Engle and D. L. McFadden, eds.) 4 2111–2245. North-Holland, Amsterdam.
  • [34] Newey, W. K. (2004). Efficient semiparametric estimation via moment restrictions. Econometrica 72 1877–1897.
  • [35] Neyman, J. and Scott, E. L. (1948). Consistent estimates based on partially consistent observations. Econometrica 16 1–32.
  • [36] Pötscher, B. M. and Prucha, I. R. (1997). Dynamic Nonlinear Econometric Models. Springer, Berlin.
  • [37] van der Vaart, A. W. (1998). Asymptotic Statistics. Cambridge Univ. Press, Cambridge.