## The Annals of Statistics

### A new scope of penalized empirical likelihood with high-dimensional estimating equations

#### Abstract

Statistical methods based on empirical likelihood (EL) are appealing and effective, especially in conjunction with estimating equations, because they incorporate data information flexibly and adaptively. EL approaches are known to encounter difficulties in high-dimensional problems. To overcome these challenges, we begin by investigating high-dimensional EL from a new scope that targets high-dimensional sparse model parameters. We show that the new scope provides an opportunity to relax the stringent requirement on the dimensionality of the model parameters. Motivated by the new scope, we then propose a new penalized EL that applies two penalty functions, one regularizing the model parameters and the other regularizing the associated Lagrange multiplier in the optimizations of EL. By penalizing the Lagrange multiplier to encourage its sparsity, a drastic dimension reduction in the number of estimating equations can be achieved. Most attractively, such a reduction in the dimensionality of the estimating equations can be viewed as a selection among the high-dimensional estimating equations, resulting in a highly parsimonious and effective device for estimating high-dimensional sparse model parameters. Allowing the dimensionalities of both the model parameters and the estimating equations to grow exponentially with the sample size, our theory demonstrates that the new penalized EL estimator is sparse and consistent, with asymptotically normally distributed nonzero components. Numerical simulations and a real data analysis show that the proposed penalized EL performs promisingly.
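
As a rough illustration of the construction the abstract describes, the display below sketches a doubly penalized EL objective. All notation here is an assumption introduced for illustration and does not appear in the abstract: $g(X_i;\theta)\in\mathbb{R}^r$ denotes the estimating functions, $\theta\in\mathbb{R}^p$ the model parameter, $\lambda\in\mathbb{R}^r$ the Lagrange multiplier, and $P_{1,\pi}$, $P_{2,\nu}$ two generic penalty functions with tuning parameters $\pi$ and $\nu$. The unpenalized part is the standard dual form of EL with estimating equations; the exact penalties, constraint set, and scaling used by the authors should be checked against the paper itself.

$$
\hat{\theta}_n \;=\; \arg\min_{\theta}\; \max_{\lambda \in \hat{\Lambda}_n(\theta)}
\left\{ \sum_{i=1}^{n} \log\!\bigl(1 + \lambda^{\mathrm{T}} g(X_i;\theta)\bigr)
\;+\; n \sum_{k=1}^{p} P_{1,\pi}\bigl(|\theta_k|\bigr)
\;-\; n \sum_{j=1}^{r} P_{2,\nu}\bigl(|\lambda_j|\bigr) \right\},
\qquad
\hat{\Lambda}_n(\theta) = \bigl\{\lambda : 1 + \lambda^{\mathrm{T}} g(X_i;\theta) > 0,\; i=1,\dots,n \bigr\}.
$$

In this sketch the penalty on $\theta$ is added because the outer problem is a minimization, while the penalty on $\lambda$ is subtracted because the inner problem is a maximization. A component $\lambda_j$ shrunk to zero removes the $j$-th estimating function from the inner sum, which is the sense in which sparsity of the Lagrange multiplier acts as a selection among the estimating equations.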

#### Article information

Source
Ann. Statist., Volume 46, Number 6B (2018), 3185-3216.

Dates
Revised: October 2017
First available in Project Euclid: 11 September 2018

Permanent link to this document
https://projecteuclid.org/euclid.aos/1536631271

Digital Object Identifier
doi:10.1214/17-AOS1655

Mathematical Reviews number (MathSciNet)
MR3852649

Subjects
Primary: 62G99: None of the above, but in this section
Secondary: 62F40: Bootstrap, jackknife and other resampling methods

#### Citation

Chang, Jinyuan; Tang, Cheng Yong; Wu, Tong Tong. A new scope of penalized empirical likelihood with high-dimensional estimating equations. Ann. Statist. 46 (2018), no. 6B, 3185–3216. doi:10.1214/17-AOS1655. https://projecteuclid.org/euclid.aos/1536631271

#### Supplemental materials

• Supplement to “A new scope of penalized empirical likelihood with high-dimensional estimating equations”. Additional technical proofs and a data analysis are given in the Supplementary Material.