Electronic Journal of Statistics

Noisy independent factor analysis model for density estimation and classification

Umberto Amato, Anestis Antoniadis, Alexander Samarov, and Alexandre B. Tsybakov
Source: Electron. J. Statist. Volume 4 (2010), 707-736.

Abstract

We consider the problem of multivariate density estimation when the unknown density is assumed to follow a particular form of dimensionality reduction, a noisy independent factor analysis (IFA) model. In this model the data are generated by a number of latent independent components having unknown distributions and are observed in Gaussian noise. We do not assume that either the number of components or the matrix mixing the components is known. We show that densities of this form can be estimated at a fast rate. Using the mirror averaging aggregation algorithm, we construct a density estimator which achieves a nearly parametric rate $(\log^{1/4}{n})/\sqrt{n}$, independent of the dimensionality of the data, as the sample size $n$ tends to infinity. This estimator is adaptive to the number of components, their distributions and the mixing matrix. We then apply this density estimator to construct nonparametric plug-in classifiers and show that they achieve the best obtainable rate of the excess Bayes risk, to within a logarithmic factor independent of the dimension of the data. Applications of this classifier to simulated data sets and to real data from a remote sensing experiment show promising results.
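
To make the data-generating model in the abstract concrete, the following is a minimal sketch that simulates observations of the form X = A S + noise, where S holds the latent independent components. It only illustrates the model and is not the authors' estimator. The function name `sample_noisy_ifa`, the Laplace distribution for the components, the orthonormal columns of the mixing matrix `A`, and the values of `d`, `m` and `sigma` are all assumptions chosen for illustration.

```python
import numpy as np


def sample_noisy_ifa(n, A, sigma, rng):
    """Draw n observations X = A S + eps from a noisy IFA model (illustrative).

    A     : (d, m) mixing matrix (assumed known here only to simulate data)
    sigma : standard deviation of the isotropic Gaussian noise
    """
    d, m = A.shape
    # Latent independent components; Laplace is an arbitrary non-Gaussian choice.
    S = rng.laplace(loc=0.0, scale=1.0, size=(n, m))
    # Gaussian observation noise in the full d-dimensional space.
    eps = sigma * rng.standard_normal((n, d))
    # Each row of the result is one observation A s_i + eps_i.
    return S @ A.T + eps


rng = np.random.default_rng(0)
d, m = 10, 2  # hypothetical ambient dimension and number of components
# Mixing matrix with orthonormal columns, obtained from a reduced QR factorization.
A, _ = np.linalg.qr(rng.standard_normal((d, m)))
X = sample_noisy_ifa(500, A, sigma=0.5, rng=rng)
```

In this sketch the signal lives in an m-dimensional subspace of the d-dimensional observation space, which is the kind of dimensionality reduction the abstract refers to; in the setting of the paper, neither `m` nor `A` would be available to the estimator.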

Primary Subjects: 62H25
Secondary Subjects: 62G07, 62H30
Full-text: Open access
Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.ejs/1281618584
Digital Object Identifier: doi:10.1214/09-EJS498
Mathematical Reviews number (MathSciNet): MR2678968


2012 © Institute of Mathematical Statistics
