The Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 13, Number 4 (2019), 2213-2234.
Principal nested shape space analysis of molecular dynamics data
Molecular dynamics simulations produce huge datasets of temporal sequences of molecules. It is of interest to summarize the shape evolution of the molecules in a succinct, low-dimensional representation. However, Euclidean techniques such as principal components analysis (PCA) can be problematic, as the data may lie far from a flat manifold. Principal nested spheres gives a fundamentally different decomposition of data from the usual Euclidean subspace-based PCA [Biometrika 99 (2012) 551–568]. Subspaces of successively lower dimension are fitted to the data in a backwards manner, with the aim of retaining signal and dispensing with noise at each stage. We adapt the methodology to 3D subshape spaces and provide some practical fitting algorithms. The methodology is applied to cluster analysis of peptides, where different states of the molecules can be identified. The temporal transitions between cluster states are also explored.
Received: September 2018
Revised: March 2019
First available in Project Euclid: 28 November 2019
Dryden, Ian L.; Kim, Kwang-Rae; Laughton, Charles A.; Le, Huiling. Principal nested shape space analysis of molecular dynamics data. Ann. Appl. Stat. 13 (2019), no. 4, 2213-2234. doi:10.1214/19-AOAS1277. https://projecteuclid.org/euclid.aoas/1574910042