The Annals of Statistics

Conditional density estimation in a regression setting

Sam Efromovich
Source: Ann. Statist. Volume 35, Number 6 (2007), 2504-2535.

Abstract

Regression problems are traditionally analyzed via univariate characteristics like the regression function, scale function and marginal density of regression errors. These characteristics are useful and informative whenever the association between the predictor and the response is relatively simple. More detailed information about the association can be provided by the conditional density of the response given the predictor. For the first time in the literature, this article develops the theory of minimax estimation of the conditional density for regression settings with fixed and random designs of predictors, bounded and unbounded responses and a vast set of anisotropic classes of conditional densities. The study of fixed design regression is of special interest and novelty because the known literature is devoted to the case of random predictors. For the aforementioned models, the paper suggests a universal adaptive estimator which (i) matches the performance of an oracle that knows both an underlying model and an estimated conditional density; (ii) is sharp minimax over a vast class of anisotropic conditional densities; (iii) is at least rate minimax when the response is independent of the predictor and thus a bivariate conditional density becomes a univariate density; (iv) is adaptive to an underlying design (fixed or random) of predictors.

Primary Subjects: 62G07
Secondary Subjects: 62C05, 62E20
Full-text: Open access
Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aos/1201012970
Digital Object Identifier: doi:10.1214/009053607000000253
Mathematical Reviews number (MathSciNet): MR2382656
Zentralblatt MATH identifier: 1129.62025


2012 © Institute of Mathematical Statistics