Abstract
In this paper, we study high-dimensional sparse Quadratic Discriminant Analysis (QDA) and aim to establish the optimal convergence rates for the classification error. Minimax lower bounds are established to show that structural assumptions, such as sparsity conditions on the discriminating direction and the differential graph, are necessary for constructing consistent high-dimensional QDA rules.
We then propose a classification algorithm called SDAR using constrained convex optimization under the sparsity assumptions. Both minimax upper and lower bounds are obtained and this classification rule is shown to be simultaneously rate optimal over a collection of parameter spaces, up to a logarithmic factor. Simulation studies demonstrate that SDAR performs well numerically. The algorithm is also illustrated through an analysis of prostate cancer data and colon tissue data. The methodology and theory developed for high-dimensional QDA for two groups in the Gaussian setting are also extended to multigroup classification and to classification under the Gaussian copula model.
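For orientation, the two-group Gaussian QDA rule can be written in terms of the two quantities on which sparsity is assumed. The sketch below is not SDAR: it evaluates the standard oracle rule with known population parameters and equal class priors. The function name and the symbols D (difference of the two precision matrices, playing the role of the differential graph), delta (mean difference) and beta (discriminating direction, taken here as the second precision matrix times delta) are notational assumptions made for illustration and may differ from the paper's conventions.

```python
import numpy as np

def qda_oracle_rule(z, mu1, mu2, sigma1, sigma2):
    """Classify z into group 1 or 2 with the oracle Gaussian QDA rule.

    Assumes known means and covariances and equal class priors.
    This is an illustrative sketch, not the SDAR estimator.
    """
    omega1 = np.linalg.inv(sigma1)      # precision matrix of group 1
    omega2 = np.linalg.inv(sigma2)      # precision matrix of group 2
    D = omega2 - omega1                 # "differential graph" (assumed notation)
    delta = mu2 - mu1                   # mean difference
    beta = omega2 @ delta               # "discriminating direction" (assumed notation)
    mu_bar = (mu1 + mu2) / 2.0

    # log|Sigma_1| - log|Sigma_2|, computed stably via slogdet
    _, logdet1 = np.linalg.slogdet(sigma1)
    _, logdet2 = np.linalg.slogdet(sigma2)

    # Q(z) equals twice the log-likelihood ratio log f1(z) - log f2(z)
    q = (z - mu1) @ D @ (z - mu1) - 2.0 * beta @ (z - mu_bar) - (logdet1 - logdet2)
    return 1 if q > 0 else 2

# Toy usage with made-up parameters
mu1 = np.zeros(3)
mu2 = np.array([2.0, 0.0, 0.0])
sigma1 = np.eye(3)
sigma2 = 2.0 * np.eye(3)
label_at_mu1 = qda_oracle_rule(mu1, mu1, mu2, sigma1, sigma2)
label_at_mu2 = qda_oracle_rule(mu2, mu1, mu2, sigma1, sigma2)
```

In high dimensions the inverses above cannot be estimated consistently without further structure, which is why the rule is expressed through D and beta: when both are sparse they can be targeted directly rather than through the individual precision matrices.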
Funding Statement
The first author was supported by NSF Grant DMS-1712735 and NIH Grant R01 GM-123056.
The second author was supported by NSF Grant DMS-2015378.
Acknowledgments
The authors would like to thank the anonymous referees, an Associate Editor and the editor for their constructive comments that improved the quality of this paper.
Citation
T. Tony Cai, Linjun Zhang. "A convex optimization approach to high-dimensional sparse quadratic discriminant analysis." Ann. Statist. 49 (3) 1537-1568, June 2021. https://doi.org/10.1214/20-AOS2012