The Annals of Statistics

Endogeneity in high dimensions

Jianqing Fan and Yuan Liao



Most papers on high-dimensional statistics are based on the assumption that none of the regressors are correlated with the regression error, namely, they are exogenous. Yet, endogeneity can arise incidentally from a large pool of regressors in a high-dimensional regression. This causes the inconsistency of the penalized least-squares method and possible false scientific discoveries. A necessary condition for model selection consistency of a general class of penalized regression methods is given, which allows us to prove formally the inconsistency claim. To cope with the incidental endogeneity, we construct a novel penalized focused generalized method of moments (FGMM) criterion function. The FGMM effectively achieves the dimension reduction and applies the instrumental variable methods. We show that it possesses the oracle property even in the presence of endogenous predictors, and that the solution is also near global minimum under the over-identification assumption. Finally, we also show how the semi-parametric efficiency of estimation can be achieved via a two-step approach.
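The inconsistency claim can be illustrated in the simplest possible setting. The sketch below is not the FGMM procedure of the paper; it is a one-regressor simulation, under assumed parameter values, in which the regressor is correlated with the regression error. Least squares then converges to the wrong value, while a basic instrumental variable estimator recovers the true coefficient. The function name `simulate_endogeneity`, the coefficient 2.0, and the correlation strength 0.8 are all illustrative assumptions.

```python
import numpy as np


def simulate_endogeneity(n=20000, beta=2.0, seed=0):
    """Compare least squares and a simple IV estimator when x is endogenous.

    Illustrative assumptions (not from the paper): one regressor, one
    instrument, Gaussian noise, and a correlation strength of 0.8.
    """
    rng = np.random.default_rng(seed)
    w = rng.normal(size=n)                 # instrument, independent of the error
    u = rng.normal(size=n)                 # regression error
    x = w + 0.8 * u + rng.normal(size=n)   # regressor correlated with u (endogenous)
    y = beta * x + u

    # Least squares without intercept: biased because E[x * u] != 0.
    ols = (x @ y) / (x @ x)
    # Simple IV estimator: uses the moment condition E[w * (y - beta * x)] = 0.
    iv = (w @ y) / (w @ x)
    return ols, iv


if __name__ == "__main__":
    ols, iv = simulate_endogeneity()
    print(f"true beta = 2.0, least squares = {ols:.3f}, IV = {iv:.3f}")
```

In this design the least-squares estimate settles near 2 + 0.8/2.64, not 2, however large the sample, whereas the IV estimate concentrates around the true value. The paper addresses the much harder case in which such endogeneity arises incidentally among a large number of candidate regressors.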

Article information

Ann. Statist., Volume 42, Number 3 (2014), 872-917.

First available in Project Euclid: 20 May 2014

Primary: 62F12: Asymptotic properties of estimators
Secondary: 62J02: General nonlinear regression; 62J12: Generalized linear models; 62P20: Applications to economics [See also 91Bxx]

Focused GMM; sparsity recovery; endogenous variables; oracle property; conditional moment restriction; estimating equation; over identification; global minimization; semiparametric efficiency


Fan, Jianqing; Liao, Yuan. Endogeneity in high dimensions. Ann. Statist. 42 (2014), no. 3, 872–917. doi:10.1214/13-AOS1202.


