The Annals of Statistics

Isotonic regression in general dimensions

Qiyang Han, Tengyao Wang, Sabyasachi Chatterjee, and Richard J. Samworth


Abstract

We study the least squares regression function estimator over the class of real-valued functions on $[0,1]^{d}$ that are increasing in each coordinate. For uniformly bounded signals and with a fixed, cubic lattice design, we establish that the estimator achieves the minimax rate of order $n^{-\min\{2/(d+2),1/d\}}$ in the empirical $L_{2}$ loss, up to polylogarithmic factors. Further, we prove a sharp oracle inequality, which reveals in particular that when the true regression function is piecewise constant on $k$ hyperrectangles, the least squares estimator enjoys a faster, adaptive rate of convergence of $(k/n)^{\min\{1,2/d\}}$, again up to polylogarithmic factors. Previous results are confined to the case $d\leq2$. Finally, we establish corresponding bounds (which are new even in the case $d=2$) in the more challenging random design setting. There are two surprising features of these results: first, they demonstrate that it is possible for a global empirical risk minimisation procedure to be rate optimal up to polylogarithmic factors even when the corresponding entropy integral for the function class diverges rapidly; second, they indicate that the adaptation rate for shape-constrained estimators can be strictly worse than the parametric rate.
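
For concreteness, the estimator described above can be written out as follows; this is a sketch for the fixed, cubic lattice design, and the symbols $\mathcal{F}_{d}$, $L_{d,n}$ and $\hat{f}_{n}$ are illustrative labels rather than necessarily the paper's notation. With design points $L_{d,n}\subseteq[0,1]^{d}$ forming a cubic lattice of size $n$ and observations $Y_{x}=f_{0}(x)+\varepsilon_{x}$,
$$\hat{f}_{n}\in\mathop{\mathrm{arg\,min}}_{f\in\mathcal{F}_{d}}\frac{1}{n}\sum_{x\in L_{d,n}}\bigl\{Y_{x}-f(x)\bigr\}^{2},\qquad \mathcal{F}_{d}:=\bigl\{f:[0,1]^{d}\to\mathbb{R}\;:\;f(x)\leq f(x')\text{ whenever }x_{j}\leq x_{j}'\text{ for all }j\bigr\}.$$
Since the block increasing constraint defines a closed convex cone, $\hat{f}_{n}$ is the Euclidean projection of the observation vector onto that cone, and it can be computed either with the specialised algorithms cited in the references below (e.g. Dykstra, 1983; Kyng, Rao and Sachdeva, 2015) or with a generic convex solver. A minimal sketch of the latter in Python for $d=2$, assuming the numpy and cvxpy packages are available (an illustration only, not the authors' implementation):

    import numpy as np
    import cvxpy as cp

    # Noisy observations of a block increasing signal on a 10 x 10 cubic lattice.
    n1 = n2 = 10
    rng = np.random.default_rng(0)
    x1, x2 = np.meshgrid(np.linspace(0, 1, n1), np.linspace(0, 1, n2), indexing="ij")
    f0 = (x1 > 0.5).astype(float) + (x2 > 0.5)    # piecewise constant on k = 4 hyperrectangles
    Y = f0 + 0.3 * rng.standard_normal((n1, n2))

    # Least squares over the block increasing cone, posed as a quadratic program.
    F = cp.Variable((n1, n2))
    constraints = [F[1:, :] >= F[:-1, :],         # increasing in the first coordinate
                   F[:, 1:] >= F[:, :-1]]         # increasing in the second coordinate
    cp.Problem(cp.Minimize(cp.sum_squares(Y - F)), constraints).solve()
    f_hat = F.value                               # fitted values at the design points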

Article information

Source
Ann. Statist., Volume 47, Number 5 (2019), 2440–2471.

Dates
Received: August 2017
Revised: April 2018
First available in Project Euclid: 3 August 2019

Permanent link to this document
https://projecteuclid.org/euclid.aos/1564797853

Digital Object Identifier
doi:10.1214/18-AOS1753

Mathematical Reviews number (MathSciNet)
MR3988762

Subjects
Primary: 62G05 (Estimation), 62G08 (Nonparametric regression), 62C20 (Minimax procedures)

Keywords
Isotonic regression; block increasing functions; adaptation; least squares; sharp oracle inequality; statistical dimension

Citation

Han, Qiyang; Wang, Tengyao; Chatterjee, Sabyasachi; Samworth, Richard J. Isotonic regression in general dimensions. Ann. Statist. 47 (2019), no. 5, 2440--2471. doi:10.1214/18-AOS1753. https://projecteuclid.org/euclid.aos/1564797853


References

  • Amelunxen, D., Lotz, M., McCoy, M. B. and Tropp, J. A. (2014). Living on the edge: Phase transitions in convex programs with random data. Inf. Inference 3 224–294.
  • Bacchetti, P. (1989). Additive isotonic models. J. Amer. Statist. Assoc. 84 289–294.
  • Barlow, R. E., Bartholomew, D. J., Bremner, J. M. and Brunk, H. D. (1972). Statistical Inference Under Order Restrictions. Wiley, New York.
  • Bellec, P. C. (2018). Sharp oracle inequalities for least squares estimators in shape restricted regression. Ann. Statist. 46 745–780.
  • Birgé, L. and Massart, P. (1993). Rates of convergence for minimum contrast estimators. Probab. Theory Related Fields 97 113–150.
  • Boucheron, S., Lugosi, G. and Massart, P. (2013). Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford Univ. Press, Oxford.
  • Brunk, H. D. (1955). Maximum likelihood estimates of monotone parameters. Ann. Math. Stat. 26 607–616.
  • Chatterjee, S. (2014). A new perspective on least squares under convex constraint. Ann. Statist. 42 2340–2381.
  • Chatterjee, S., Guntuboyina, A. and Sen, B. (2015). On risk bounds in isotonic and other shape restricted regression problems. Ann. Statist. 43 1774–1800.
  • Chatterjee, S., Guntuboyina, A. and Sen, B. (2018). On matrix estimation under monotonicity constraints. Bernoulli 24 1072–1100.
  • Chatterjee, S. and Lafferty, J. (2017). Adaptive risk bounds in unimodal regression. Preprint. Available at arXiv:1512.02956v5.
  • Chen, Y. and Samworth, R. J. (2016). Generalized additive and index models with shape constraints. J. R. Stat. Soc. Ser. B. Stat. Methodol. 78 729–754.
  • Donoho, D. (1991). Gelfand $n$-widths and the method of least squares. Technical report, Univ. California, Berkeley, Berkeley, CA.
  • Durot, C. (2007). On the $\mathbb{L}_{p}$-error of monotonicity constrained estimators. Ann. Statist. 35 1080–1104.
  • Durot, C. (2008). Monotone nonparametric regression with random design. Math. Methods Statist. 17 327–341.
  • Dykstra, R. L. (1983). An algorithm for restricted least squares regression. J. Amer. Statist. Assoc. 78 837–842.
  • Dykstra, R. L. and Robertson, T. (1982). An algorithm for isotonic regression for two or more independent variables. Ann. Statist. 10 708–716.
  • Eichler, E. E., Flint, J., Gibson, G., Kong, A., Leal, S. M., Moore, J. H. and Nadeau, J. H. (2010). Missing heritability and strategies for finding the underlying causes of complex disease. Nat. Rev. Genet. 11 446–450.
  • Elena, S. F. and Lenski, R. E. (1997). Test of synergistic interactions among deleterious mutations in bacteria. Nature 390 395–398.
  • Gao, F. and Wellner, J. A. (2007). Entropy estimate for high-dimensional monotonic functions. J. Multivariate Anal. 98 1751–1764.
  • Goldstein, D. B. (2009). Common genetic variation and human traits. N. Engl. J. Med. 360 1696–1698.
  • Groeneboom, P. and Jongbloed, G. (2014). Nonparametric Estimation Under Shape Constraints: Estimators, Algorithms and Asymptotics. Cambridge Series in Statistical and Probabilistic Mathematics 38. Cambridge Univ. Press, New York.
  • Guntuboyina, A. and Sen, B. (2015). Global risk bounds and adaptation in univariate convex regression. Probab. Theory Related Fields 163 379–411.
  • Han, Q., Wang, T., Chatterjee, S. and Samworth, R. J. (2019). Supplement to “Isotonic regression in general dimensions.” DOI:10.1214/18-AOS1753SUPP.
  • Kim, A. K. H., Guntuboyina, A. and Samworth, R. J. (2018). Adaptation in log-concave density estimation. Ann. Statist. 46 2279–2306.
  • Kim, A. K. H. and Samworth, R. J. (2016). Global rates of convergence in log-concave density estimation. Ann. Statist. 44 2756–2779.
  • Kyng, R., Rao, A. and Sachdeva, S. (2015). Fast, provable algorithms for isotonic regression in all $\ell_{p}$-norms. In Advances in Neural Information Processing Systems 2719–2727.
  • Luss, R., Rosset, S. and Shahar, M. (2012). Efficient regularized isotonic regression with application to gene–gene interaction search. Ann. Appl. Stat. 6 253–283.
  • Mammen, E. and Yu, K. (2007). Additive isotone regression. In Asymptotics: Particles, Processes and Inverse Problems (E. A. Cator et al., eds.). Institute of Mathematical Statistics Lecture Notes—Monograph Series 55 179–195. IMS, Beachwood, OH.
  • Mani, R., Onge, R. P. S., Hartman, J. L., Giaever, G. and Roth, F. P. (2008). Defining genetic interaction. Proc. Natl. Acad. Sci. USA 105 3461–3466.
  • Massart, P. (2000). About the constants in Talagrand’s concentration inequalities for empirical processes. Ann. Probab. 28 863–884.
  • Meyer, M. and Woodroofe, M. (2000). On the degrees of freedom in shape-restricted regression. Ann. Statist. 28 1083–1104.
  • Morton-Jones, T., Diggle, P., Parker, L., Dickinson, H. O. and Binks, K. (2000). Additive isotonic regression models in epidemiology. Stat. Med. 19 849–859.
  • Pisier, G. (1989). The Volume of Convex Bodies and Banach Space Geometry. Cambridge Tracts in Mathematics 94. Cambridge Univ. Press, Cambridge.
  • Pollard, D. (2002). A User’s Guide to Measure Theoretic Probability. Cambridge Series in Statistical and Probabilistic Mathematics 8. Cambridge Univ. Press, Cambridge.
  • Rakhlin, A., Sridharan, K. and Tsybakov, A. B. (2017). Empirical entropy, minimax regret and minimax risk. Bernoulli 23 789–824.
  • Romik, D. (2015). The Surprising Mathematics of Longest Increasing Subsequences. Institute of Mathematical Statistics Textbooks 4. Cambridge Univ. Press, New York.
  • Roth, F. P., Lipshitz, H. D. and Andrews, B. J. (2009). Q&A: Epistasis. J. Biol. 8 35.
  • Sanjuán, R. and Elena, S. F. (2006). Epistasis correlates to genomic complexity. Proc. Natl. Acad. Sci. USA 103 14402–14405.
  • Schell, M. J. and Singh, B. (1997). The reduced monotonic regression method. J. Amer. Statist. Assoc. 92 128–135.
  • Shao, H., Burrage, L. C., Sinasac, D. S., Hill, A. E., Ernest, S. R., O’Brien, W., Courtland, H.-W., Jepsen, K. J., Kirby, A., Kulbokas, E. J., Daly, M. J., Broman, K. W., Lander, E. S. and Nadeau, J. H. (2008). Genetic architecture of complex traits: Large phenotypic effects and pervasive epistasis. Proc. Natl. Acad. Sci. USA 105 19910–19914.
  • Stout, Q. F. (2015). Isotonic regression for multiple independent variables. Algorithmica 71 450–470.
  • Talagrand, M. (1996). New concentration inequalities in product spaces. Invent. Math. 126 505–563.
  • Tong, A. H., Evangelista, M., Parsons, A. B., Xu, H., Bader, G. D., Pagé, N., Robinson, M., Raghibizadeh, S., Hogue, C. W. V., Bussey, H., Andrews, B., Tyers, M. and Boone, C. (2001). Systematic genetic analysis with ordered arrays of yeast deletion mutants. Science 294 2364–2368.
  • van de Geer, S. (1990). Estimating a regression function. Ann. Statist. 18 907–924.
  • van de Geer, S. (1993). Hellinger-consistency of certain nonparametric maximum likelihood estimators. Ann. Statist. 21 14–44.
  • van de Geer, S. A. (2000). Applications of Empirical Process Theory. Cambridge Series in Statistical and Probabilistic Mathematics 6. Cambridge Univ. Press, Cambridge.
  • van der Vaart, A. W. and Wellner, J. A. (1996). Weak Convergence and Empirical Processes: With Applications to Statistics. Springer, New York.
  • van Eeden, C. (1958). Testing and Estimating Ordered Parameters of Probability Distributions. Mathematical Centre, Amsterdam.
  • Yang, F. and Barber, R. F. (2017). Uniform convergence of isotonic regression. Preprint. Available at arXiv:1706.01852.
  • Yang, Y. and Barron, A. (1999). Information-theoretic determination of minimax rates of convergence. Ann. Statist. 27 1564–1599.
  • Yu, B. (1997). Assouad, Fano, and Le Cam. In Festschrift for Lucien Le Cam (D. Pollard, E. Torgersen and G. L. Yang, eds.) 423–435. Springer, New York.
  • Zhang, C.-H. (2002). Risk bounds in isotonic regression. Ann. Statist. 30 528–555.

Supplemental materials

  • Supplementary material to “Isotonic regression in general dimensions”. Auxiliary results.