Open Access
CDPA: Common and distinctive pattern analysis between high-dimensional datasets
Hai Shu, Zhe Qu
Electron. J. Statist. 16(1): 2475-2517 (2022). DOI: 10.1214/22-EJS2008

Abstract

A representative model in integrative analysis of two high-dimensional correlated datasets is to decompose each data matrix into a low-rank common matrix generated by latent factors shared across datasets, a low-rank distinctive matrix corresponding to each dataset, and an additive noise matrix. Existing decomposition methods claim that their common matrices capture the common pattern of the two datasets. However, their so-called common pattern only denotes the common latent factors but ignores the common pattern between the two coefficient matrices of these common latent factors. We propose a new unsupervised learning method, called the common and distinctive pattern analysis (CDPA), which appropriately defines the two types of data patterns by further incorporating the common and distinctive patterns of the coefficient matrices. A consistent estimation approach is developed for high-dimensional settings, and shows reasonably good finite-sample performance in simulations. Our simulation studies and real data analysis corroborate that the proposed CDPA can provide better characterization of common and distinctive patterns and thereby benefit data mining.
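
The decomposition model described above can be illustrated with a small simulation. This is a minimal sketch of the generic three-part model only (common + distinctive + noise), not the CDPA estimator. The function name, dimensions, ranks, Gaussian factors, and the independence of the dataset-specific factors are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def simulate_two_datasets(p1=50, p2=40, n=200, r_common=2, r_distinct=3,
                          noise_sd=0.1, seed=0):
    """Simulate X_k = C_k + D_k + E_k for k = 1, 2 (hypothetical setup).

    C_k = B_k @ F : low-rank common matrix; F holds latent factors shared
                    by both datasets, B_k is the dataset-specific
                    coefficient matrix of those factors.
    D_k = A_k @ G_k : low-rank distinctive matrix with its own factors G_k.
    E_k : additive noise.
    """
    rng = np.random.default_rng(seed)
    F = rng.standard_normal((r_common, n))  # shared latent factors
    datasets = []
    for p in (p1, p2):
        B = rng.standard_normal((p, r_common))     # coefficients of shared factors
        A = rng.standard_normal((p, r_distinct))   # coefficients of distinctive factors
        G = rng.standard_normal((r_distinct, n))   # dataset-specific factors (assumed independent)
        C = B @ F
        D = A @ G
        E = noise_sd * rng.standard_normal((p, n))
        datasets.append({"X": C + D + E, "C": C, "D": D, "E": E, "B": B})
    return datasets

data1, data2 = simulate_two_datasets()
```

In this sketch both common matrices share the factors `F`, yet their coefficient matrices `data1["B"]` and `data2["B"]` are unrelated. That is the situation the abstract points to: sharing latent factors does not by itself imply a common pattern between the coefficient matrices.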

Funding Statement

Dr. Shu’s research was partially supported by the grant R21AG070303 from the National Institutes of Health and a startup fund from New York University.

Citation

Hai Shu, Zhe Qu. "CDPA: Common and distinctive pattern analysis between high-dimensional datasets." Electron. J. Statist. 16(1): 2475-2517, 2022. https://doi.org/10.1214/22-EJS2008

Information

Received: 1 April 2021; Published: 2022
First available in Project Euclid: 4 April 2022

MathSciNet: MR4402971
zbMATH: 07524978
Digital Object Identifier: 10.1214/22-EJS2008

Keywords: canonical variable, data integration, factor pattern, graph matching, mixing channel, principal vector

Vol.16 • No. 1 • 2022