Open Access
Sparse logistic principal components analysis for binary data
Seokho Lee, Jianhua Z. Huang, Jianhua Hu
Ann. Appl. Stat. 4(3): 1579-1601 (September 2010). DOI: 10.1214/10-AOAS327


We develop a new principal components analysis (PCA) type dimension reduction method for binary data. Unlike standard PCA, which is defined on the observed data, the proposed PCA is defined on the logit transform of the success probabilities of the binary observations. Sparsity is introduced into the principal component (PC) loading vectors for enhanced interpretability and more stable extraction of the principal components. Our sparse PCA is formulated as an optimization problem whose criterion function is motivated by a penalized Bernoulli likelihood. A Majorization–Minimization algorithm is developed to solve the optimization problem efficiently. The effectiveness of the proposed sparse logistic PCA method is illustrated by an application to a single nucleotide polymorphism data set and by a simulation study.
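The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a logit-scale model Theta = 1 mu^T + A B^T with orthonormal score columns A, an L1 penalty on the loadings B, and the standard uniform quadratic bound on the Bernoulli negative log-likelihood (curvature at most 1/4) as the majorizer; the paper may use a different bound, penalty form, or update order. All names (`sparse_logistic_pca`, `mm_step`, `lam`, `k`) are hypothetical.

```python
import numpy as np

def sigmoid(t):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + np.tanh(0.5 * t))

def objective(Y, mu, A, B, lam):
    # Penalized negative Bernoulli log-likelihood on the logit scale.
    theta = mu + A @ B.T
    nll = np.sum(np.logaddexp(0.0, theta) - Y * theta)
    return nll + lam * np.abs(B).sum()

def soft_threshold(C, t):
    return np.sign(C) * np.maximum(np.abs(C) - t, 0.0)

def mm_step(Y, mu, A, B, lam):
    # Majorize: the negative log-likelihood is bounded above by
    # (1/8) * ||X - Theta||_F^2 + const, with working response X.
    theta = mu + A @ B.T
    X = theta + 4.0 * (Y - sigmoid(theta))
    # Minimize the majorizer block by block (assumed update order).
    mu = (X - A @ B.T).mean(axis=0)            # intercepts
    Xc = X - mu
    B = soft_threshold(Xc.T @ A, 4.0 * lam)    # sparse loadings, A orthonormal
    U, _, Vt = np.linalg.svd(Xc @ B, full_matrices=False)
    A = U @ Vt                                 # orthonormal scores (Procrustes)
    return mu, A, B

def sparse_logistic_pca(Y, k=2, lam=1.0, n_iter=50):
    Y = np.asarray(Y, dtype=float)
    # Initialize from an SVD of the centered, rescaled binary data.
    U, s, Vt = np.linalg.svd(4.0 * (Y - Y.mean(axis=0)), full_matrices=False)
    mu, A, B = np.zeros(Y.shape[1]), U[:, :k], Vt[:k].T * s[:k]
    history = [objective(Y, mu, A, B, lam)]
    for _ in range(n_iter):
        mu, A, B = mm_step(Y, mu, A, B, lam)
        history.append(objective(Y, mu, A, B, lam))
    return mu, A, B, history

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Y = (rng.random((40, 12)) < 0.5).astype(float)  # synthetic binary data
    mu, A, B, history = sparse_logistic_pca(Y, k=2, lam=0.5, n_iter=30)
    print("zero loadings:", int((B == 0).sum()), "of", B.size)
```

Because each block update minimizes the same majorizer, the penalized objective recorded in `history` should not increase from one iteration to the next, which is the defining property of an MM scheme. Larger values of `lam` drive more loadings exactly to zero.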



Published: September 2010
First available in Project Euclid: 18 October 2010

zbMATH: 1202.62084
MathSciNet: MR2758342
Digital Object Identifier: 10.1214/10-AOAS327

Keywords: Binary data, Dimension reduction, Lasso, MM algorithm, PCA, Regularization, Sparsity

Rights: Copyright © 2010 Institute of Mathematical Statistics
