Open Access
December 2014
Bayesian protein structure alignment
Abel Rodriguez, Scott C. Schmidler
Ann. Appl. Stat. 8(4): 2068-2095 (December 2014). DOI: 10.1214/14-AOAS780

Abstract

The analysis of the three-dimensional structure of proteins is an important topic in molecular biochemistry. Structure plays a critical role in defining the function of proteins and is more strongly conserved than amino acid sequence over evolutionary timescales. A key challenge is the identification and evaluation of structural similarity between proteins; such analysis can aid in understanding the role of newly discovered proteins and help elucidate evolutionary relationships between organisms. Computational biologists have developed many clever algorithmic techniques for comparing protein structures; however, all are based on heuristic optimization criteria, making statistical interpretation somewhat difficult. Here we present a fully probabilistic framework for pairwise structural alignment of proteins. Our approach has several advantages, including the ability to capture alignment uncertainty and to estimate key “gap” parameters which critically affect the quality of the alignment. We show that several existing alignment methods arise as maximum a posteriori estimates under specific choices of prior distributions and error models. Our probabilistic framework is also easily extended to incorporate additional information, which we demonstrate by including primary sequence information to generate simultaneous sequence–structure alignments that can resolve ambiguities obtained using structure alone. This combined model also provides a natural approach for the difficult task of estimating evolutionary distance based on structural alignments. The model is illustrated by comparison with well-established methods on several challenging protein alignment examples.

Citation

Abel Rodriguez, Scott C. Schmidler. "Bayesian protein structure alignment." Ann. Appl. Stat. 8 (4) 2068-2095, December 2014. https://doi.org/10.1214/14-AOAS780

Information

Published: December 2014
First available in Project Euclid: 19 December 2014

zbMATH: 06408770
MathSciNet: MR3292489
Digital Object Identifier: 10.1214/14-AOAS780

Keywords: affine gap, dynamic programming, Procrustes distance, protein alignment, structure alignment

Rights: Copyright © 2014 Institute of Mathematical Statistics
