## Bernoulli

• Volume 25, Number 4B (2019), 3276-3310.

### Least squares estimation in the monotone single index model

#### Abstract

We study the monotone single index model where a real response variable $Y$ is linked to a $d$-dimensional covariate $X$ through the relationship $E[Y|X]=\Psi_{0}(\alpha^{T}_{0}X)$, almost surely. Both the ridge function, $\Psi_{0}$, and the index parameter, $\alpha_{0}$, are unknown and the ridge function is assumed to be monotone. Under some appropriate conditions, we show that the rate of convergence in the $L_{2}$-norm for the least squares estimator of the bundled function $\Psi_{0}({\alpha}^{T}_{0}\cdot)$ is $n^{1/3}$. A similar result is established for the isolated ridge function, and the index is shown to converge at least at the rate $n^{1/3}$. Since the least squares estimator of the index is computationally intensive, we also consider alternative estimators of the index $\alpha_{0}$ from earlier literature. Moreover, we show that if the rate of convergence of such an alternative estimator is at least $n^{1/3}$, then the corresponding least-squares type estimators (obtained via a “plug-in” approach) of both the bundled and isolated ridge functions still converge at the rate $n^{1/3}$.
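The "plug-in" idea in the abstract can be illustrated with a minimal sketch: given some estimate of the index (from any alternative estimator), project the covariates onto it and fit a nondecreasing function to the responses by isotonic least squares. This is not the authors' implementation. The function names `pava` and `plug_in_ridge_fit`, the unit-norm normalization of the index, and the assumption that the ridge function is nondecreasing are all choices made here for illustration. The sketch does not attempt the full least squares estimator, which also minimizes over the index.

```python
import numpy as np


def pava(y, w=None):
    """Pool-adjacent-violators: nondecreasing weighted least squares fit to y."""
    y = np.asarray(y, dtype=float)
    w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Merge the last two blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, n1 + n2])
    return np.concatenate([np.full(n, m) for m, _, n in blocks])


def plug_in_ridge_fit(X, Y, alpha_hat):
    """Isotonic fit of Y on the projections alpha_hat^T X.

    Returns the projections z and the fitted ridge values at each z.
    Assumption (not from the source): alpha_hat is rescaled to unit norm,
    and ties among the projections are ignored (they have probability
    zero for continuous covariates).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    alpha_hat = np.asarray(alpha_hat, dtype=float)
    alpha_hat = alpha_hat / np.linalg.norm(alpha_hat)
    z = X @ alpha_hat
    order = np.argsort(z, kind="stable")
    fitted = np.empty_like(z)
    fitted[order] = pava(Y[order])  # monotone fit in the order of z
    return z, fitted
```

A call such as `z, fitted = plug_in_ridge_fit(X, Y, alpha_hat)` returns fitted values that are nondecreasing in `z`. Evaluating the fit at new points would additionally require an interpolation convention (for example, a piecewise-constant extension), which is left out here.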

#### Article information

Source
Bernoulli, Volume 25, Number 4B (2019), 3276-3310.

Dates
Revised: August 2018
First available in Project Euclid: 25 September 2019

Permanent link to this document
https://projecteuclid.org/euclid.bj/1569398766

Digital Object Identifier
doi:10.3150/18-BEJ1090

Mathematical Reviews number (MathSciNet)
MR4010955

Zentralblatt MATH identifier
07110138

#### Citation

Balabdaoui, Fadoua; Durot, Cécile; Jankowski, Hanna. Least squares estimation in the monotone single index model. Bernoulli 25 (2019), no. 4B, 3276-3310. doi:10.3150/18-BEJ1090. https://projecteuclid.org/euclid.bj/1569398766

#### Supplemental materials

• Supplement to “Least squares estimation in the monotone single index model”. We provide additional proofs, we give an algorithm to compute the LSE exactly for the special case when $d=2$, we give properties of exponential families, and we provide additional simulations for Section 5.