Generalized M-estimators for high-dimensional Tobit I models

Jelena Bradic; Jiaqi Guo

doi:10.1214/18-EJS1463

Electron. J. Statist. 13(1): 582-645 (2019). DOI: 10.1214/18-EJS1463

Abstract

This paper develops robust confidence intervals for high-dimensional, left-censored regression. Type-I censored regression models, in which a competing event makes the variable of interest unobservable, are extremely common in practice. We develop smoothed estimating equations that adapt to the censoring level and are more robust to misspecification of the error distribution. We propose a unified class of robust estimators, including one-step Mallows', Schweppe's, and Hill-Ryan's estimators, that adapt to left-censored observations. In the ultra-high-dimensional setting, where the dimensionality can grow exponentially with the sample size, we show that as long as the preliminary estimator converges faster than $n^{-1/4}$, the one-step estimators inherit the asymptotic distribution of the fully iterated version. Moreover, we show that the size of the residuals of the Bahadur representation matches that of pure linear models; that is, the effects of censoring disappear asymptotically. Simulation studies demonstrate that our method adapts to the censoring level and to asymmetry in the error distribution, and does not lose efficiency when the errors come from symmetric distributions.
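To make the left-censored setting concrete, the following is a minimal sketch and not the authors' estimator. It assumes the standard Tobit I form $y_i = \max\{c, x_i^\top \beta + \varepsilon_i\}$ with a known censoring level $c = 0$, Gaussian design and noise, and a low-dimensional $\beta$; the function name `simulate_left_censored` and all parameter values are hypothetical choices made for illustration. It only shows why censoring needs to be accounted for: ordinary least squares applied to the censored response understates the true slope.

```python
import numpy as np

def simulate_left_censored(n, beta, c=0.0, seed=0):
    """Draw (X, y, censored) from y = max(c, X @ beta + eps), eps ~ N(0, 1).

    Illustrative assumption: Gaussian design and Gaussian errors.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, len(beta)))
    latent = X @ np.asarray(beta, dtype=float) + rng.standard_normal(n)
    y = np.maximum(c, latent)   # latent values below c are recorded as c
    censored = latent <= c      # indicator of censored observations
    return X, y, censored

# Hypothetical true coefficients: one active covariate, two inactive ones.
beta_true = np.array([2.0, 0.0, 0.0])
X, y, censored = simulate_left_censored(n=5000, beta=beta_true)

# Naive least squares with an intercept, ignoring the censoring entirely.
X1 = np.column_stack([np.ones(len(y)), X])
beta_naive = np.linalg.lstsq(X1, y, rcond=None)[0][1:]

censoring_rate = censored.mean()
print(f"censoring rate: {censoring_rate:.2f}")
print("true slope:", beta_true[0], "naive OLS slope:", round(beta_naive[0], 2))
```

In this design roughly half of the observations are censored, and the naive slope is pulled toward zero, which is the kind of distortion that censoring-adaptive estimating equations are meant to correct.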

Creative Commons Attribution 4.0 International License.

Received: 1 May 2017; Published: 2019
