Open Access
June 2018
Torus principal component analysis with applications to RNA structure
Benjamin Eltzner, Stephan Huckemann, Kanti V. Mardia
Ann. Appl. Stat. 12(2): 1332-1359 (June 2018). DOI: 10.1214/17-AOAS1115

Abstract

Several cutting-edge applications need PCA methods for data on tori. We propose a novel torus-PCA method that adaptively favors low-dimensional representations while preventing overfitting by a new test; both features can be generally applied and address shortcomings in two previously proposed PCA methods. Unlike tangent space PCA, our torus-PCA features structure fidelity by honoring the cyclic topology of the data space and, unlike geodesic PCA, produces nonwinding, nondense descriptors. These features are achieved by deforming tori into spheres with self-gluing and then using a variant of the recently developed principal nested spheres analysis. This analysis involves a step of subsphere fitting, and we provide a new test to avoid overfitting. We validate our torus-PCA by application to an RNA benchmark data set. Further, using a larger RNA data set, torus-PCA recovers previously found structure, now globally at the one-dimensional representation, which is not accessible via tangent space PCA.
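
The point about honoring the cyclic topology of the data space can be illustrated with a minimal sketch. This is not the torus-PCA procedure itself, only a toy comparison of a Euclidean mean with a circular mean for angles clustered around the cut at ±π. The sample values and the helper names `circular_mean` and `angular_distance` are assumptions made for illustration.

```python
import numpy as np

def circular_mean(angles):
    """Mean direction of angles (radians), computed via unit vectors on the circle."""
    return np.arctan2(np.mean(np.sin(angles)), np.mean(np.cos(angles)))

def angular_distance(a, b):
    """Shortest distance between two angles on the circle, in [0, pi]."""
    d = np.abs(a - b) % (2 * np.pi)
    return np.minimum(d, 2 * np.pi - d)

# Hypothetical angles (radians) clustered around the cut at +/- pi.
angles = np.array([3.0, 3.1, -3.1, -3.0])

# Treating the angles as real numbers ignores that -pi and pi coincide.
euclidean_mean = float(np.mean(angles))

# Averaging on the circle respects the cyclic topology.
circ_mean = float(circular_mean(angles))

# Largest angular distance from each candidate center to the data.
spread_euclidean = float(np.max(angular_distance(angles, euclidean_mean)))
spread_circular = float(np.max(angular_distance(angles, circ_mean)))

print("Euclidean mean:", euclidean_mean, "max angular distance:", spread_euclidean)
print("Circular mean: ", circ_mean, "max angular distance:", spread_circular)
```

In this toy case the Euclidean mean lands near 0, on the opposite side of the circle from every data point, while the circular mean lands near ±π, inside the cluster. Data on a torus, such as vectors of dihedral angles, raise the same issue in each angular coordinate.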

Citation

Benjamin Eltzner, Stephan Huckemann, Kanti V. Mardia. "Torus principal component analysis with applications to RNA structure." Ann. Appl. Stat. 12(2): 1332-1359, June 2018. https://doi.org/10.1214/17-AOAS1115

Information

Received: 1 March 2017; Revised: 1 July 2017; Published: June 2018
First available in Project Euclid: 28 July 2018

zbMATH: 06980496
MathSciNet: MR3834306
Digital Object Identifier: 10.1214/17-AOAS1115

Keywords: dihedral angles, dimension reduction, directional statistics, fitting small spheres, principal nested spheres analysis, statistics on manifolds, tori deformation

Rights: Copyright © 2018 Institute of Mathematical Statistics
