Open Access
A stochastic algorithm for probabilistic independent component analysis
Stéphanie Allassonnière, Laurent Younes
Ann. Appl. Stat. 6(1): 125-160 (March 2012). DOI: 10.1214/11-AOAS499
Abstract

Decomposing a sample of images on a relevant subspace is a recurrent problem in many fields, from computer vision to medical image analysis. In this paper we propose a new learning principle and implementation for the generative decomposition model generally known as noisy ICA (independent component analysis). The approach is based on the SAEM algorithm, a versatile stochastic approximation of the standard EM algorithm. We demonstrate the applicability of the method on a wide range of decomposition models and illustrate the developments with experimental results on various data sets.
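
For orientation, the generic form of a noisy ICA model and of one SAEM iteration can be written as below. This is a minimal sketch under stated assumptions, not the paper's exact formulation: the symbols $A$, $\beta_i$, $\sigma$, $\theta$, $S$, $s_k$ and $\gamma_k$ are notation chosen here for illustration, and the complete-data model is assumed to belong to an exponential family with sufficient statistic $S$.

```latex
% Generic noisy ICA model (notation assumed for illustration):
% observation X_i, mixing matrix A, sources beta_i with independent
% non-Gaussian components, isotropic Gaussian noise of scale sigma.
\[
  X_i = A\,\beta_i + \sigma\,\varepsilon_i,
  \qquad \varepsilon_i \sim \mathcal{N}(0, I),
  \qquad i = 1, \dots, n .
\]

% One SAEM iteration k, with parameter theta = (A, sigma, ...) and
% complete-data sufficient statistic S:
\begin{align*}
  \text{(simulation)} \quad
    & \beta^{(k)} \sim p\bigl(\beta \mid X;\, \theta_{k-1}\bigr), \\
  \text{(stochastic approximation)} \quad
    & s_k = s_{k-1} + \gamma_k \bigl( S(X, \beta^{(k)}) - s_{k-1} \bigr), \\
  \text{(maximization)} \quad
    & \theta_k = \operatorname*{arg\,max}_{\theta}\; L(s_k, \theta),
\end{align*}
% where L(s, theta) is the complete-data log-likelihood written as a
% function of the sufficient statistic, and the step sizes satisfy
% gamma_k > 0, sum_k gamma_k = infinity, sum_k gamma_k^2 < infinity.
```

When exact sampling from the posterior of the sources is not available, the simulation step is commonly replaced by one or more transitions of an MCMC kernel targeting that posterior.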

Copyright © 2012 Institute of Mathematical Statistics
Stéphanie Allassonnière and Laurent Younes "A stochastic algorithm for probabilistic independent component analysis," The Annals of Applied Statistics 6(1), 125-160, (March 2012). https://doi.org/10.1214/11-AOAS499