Electronic Journal of Statistics

Noisy independent factor analysis model for density estimation and classification

Umberto Amato, Anestis Antoniadis, Alexander Samarov, and Alexandre B. Tsybakov


Abstract

We consider the problem of multivariate density estimation when the unknown density is assumed to follow a particular form of dimensionality reduction, a noisy independent factor analysis (IFA) model. In this model the data are generated by a number of latent independent components having unknown distributions and are observed in Gaussian noise. We do not assume that either the number of components or the matrix mixing the components is known. We show that densities of this form can be estimated with a fast rate. Using the mirror averaging aggregation algorithm, we construct a density estimator that achieves a nearly parametric rate $(\log^{1/4}{n})/\sqrt{n}$, independent of the dimensionality of the data, as the sample size n tends to infinity. This estimator is adaptive to the number of components, their distributions, and the mixing matrix. We then apply this density estimator to construct nonparametric plug-in classifiers and show that they achieve the best obtainable rate of the excess Bayes risk, to within a logarithmic factor independent of the dimension of the data. Applications of these classifiers to simulated data sets and to real data from a remote sensing experiment show promising results.
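To make the abstract's two main ingredients concrete, here is a minimal, self-contained sketch: it simulates data from a noisy IFA model X = AS + ε, with independent non-Gaussian latent factors S and Gaussian noise ε, and then aggregates a small family of candidate density estimators by mirror averaging under Kullback–Leibler loss. The candidate family used here (kernel density estimators with different bandwidth multipliers) and all constants are illustrative stand-ins, not the paper's construction, whose candidates are IFA-based estimators indexed by the hypothesized number of components.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Simulate the noisy IFA model X = A S + eps (all values illustrative):
# m latent independent non-Gaussian factors observed in R^d in Gaussian noise.
n, d, m = 1000, 4, 2
A = rng.standard_normal((d, m))                   # unknown mixing matrix
S = rng.laplace(size=(n, m))                      # independent latent factors
X = S @ A.T + 0.3 * rng.standard_normal((n, d))   # noisy observations

# Sample splitting: fit candidate densities on X1, aggregate them on X2.
X1, X2 = X[: n // 2], X[n // 2 :]

# Stand-in candidate family: full-dimensional kernel density estimators with
# different bandwidth multipliers (the paper instead builds one IFA-based
# estimator per hypothesized number of components).
def make_candidate(X_train, bw_factor):
    kde = gaussian_kde(X_train.T)
    kde.set_bandwidth(kde.factor * bw_factor)
    return lambda x: kde(np.atleast_2d(x).T)      # density values at points x

candidates = [make_candidate(X1, c) for c in (0.5, 1.0, 2.0, 4.0)]

# Mirror averaging under Kullback-Leibler loss: the aggregate mixes the
# candidates with weights averaged over the sequential steps,
#   w_j^(k) proportional to exp( sum_{i<k} log p_j(X_i) ),  k = 1, ..., n2.
logp = np.stack([np.log(np.maximum(p(X2), 1e-300)) for p in candidates])
cum = np.hstack([np.zeros((len(candidates), 1)),
                 np.cumsum(logp, axis=1)[:, :-1]])   # log-likelihoods over i < k
Wk = np.exp(cum - cum.max(axis=0))
Wk /= Wk.sum(axis=0)                                 # weights at each step k
weights = Wk.mean(axis=1)                            # time-averaged weights

def aggregated_density(x):
    return weights @ np.stack([p(x) for p in candidates])

print("mirror-averaging weights:", np.round(weights, 3))
print("aggregated density at the origin:", aggregated_density(np.zeros(d)))
```

Since the Kullback–Leibler mirror-averaging aggregate is a fixed mixture of the candidates, the time-averaged weights completely determine the aggregated density; in the paper, the same aggregation step over IFA-based candidates is what yields adaptivity to the number of components, their distributions, and the mixing matrix.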

Article information

Source
Electron. J. Statist., Volume 4 (2010), 707–736.

Dates
First available in Project Euclid: 12 August 2010

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1281618584

Digital Object Identifier
doi:10.1214/09-EJS498

Mathematical Reviews number (MathSciNet)
MR2678968

Zentralblatt MATH identifier
1329.62273

Subjects
Primary: 62H25: Factor analysis and principal components; correspondence analysis
Secondary: 62G07: Density estimation; 62H30: Classification and discrimination; cluster analysis [See also 68T10, 91C20]

Keywords
Nonparametric density estimation; independent factor analysis; aggregation; plug-in classifier; remote sensing

Citation

Amato, Umberto; Antoniadis, Anestis; Samarov, Alexander; Tsybakov, Alexandre B. Noisy independent factor analysis model for density estimation and classification. Electron. J. Statist. 4 (2010), 707–736. doi:10.1214/09-EJS498. https://projecteuclid.org/euclid.ejs/1281618584

