Bayesian matching of unlabeled marked point sets using random fields, with an application to molecular alignment

Irina Czogiel; Ian L. Dryden; Christopher J. Brignell

doi:10.1214/11-AOAS486

Ann. Appl. Stat. 5(4): 2603-2629 (December 2011). DOI: 10.1214/11-AOAS486

Abstract

Statistical methodology is proposed for comparing unlabeled marked point sets, with an application to aligning steroid molecules in chemoinformatics. Methods from statistical shape analysis are combined with techniques for predicting random fields in spatial statistics in order to define a suitable measure of similarity between two marked point sets. Bayesian modeling of the predicted field overlap between pairs of point sets is proposed, and posterior inference of the alignment is carried out using Markov chain Monte Carlo simulation. By representing the fields in reproducing kernel Hilbert spaces, the degree of overlap can be computed without expensive numerical integration. Superimposing entire fields rather than the configuration matrices of point coordinates thereby avoids the problem that there is usually no clear one-to-one correspondence between the points. In addition, mask parameters are introduced in the model, so that partial matching of the marked point sets can be carried out. We also propose an adaptation of the generalized Procrustes analysis algorithm for the simultaneous alignment of multiple point sets. The methodology is illustrated with a simulation study and then applied to a data set of 31 steroid molecules, where the relationship between shape and binding activity to the corticosteroid binding globulin receptor is explored.
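The remark that overlap can be computed without numerical integration rests on a standard property of reproducing kernel Hilbert spaces: if two fields are finite kernel expansions, f = sum_i a_i k(., x_i) and g = sum_j b_j k(., y_j), their inner product is the double sum of a_i b_j k(x_i, y_j). The sketch below illustrates that property only and is not the authors' model. The Gaussian kernel, the length scale, the random weights standing in for mark-derived coefficients, and all function names are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(X, Y, length_scale=1.0):
    """Kernel matrix with entries exp(-||x_i - y_j||^2 / (2 * length_scale^2))."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * length_scale ** 2))

def rkhs_inner(X, a, Y, b, length_scale=1.0):
    """Inner product <f, g> of two kernel expansions, as a finite double sum."""
    return a @ gaussian_kernel(X, Y, length_scale) @ b

def similarity(X, a, Y, b, length_scale=1.0):
    """Normalized overlap <f, g> / sqrt(<f, f> <g, g>), which lies in [-1, 1]."""
    fg = rkhs_inner(X, a, Y, b, length_scale)
    ff = rkhs_inner(X, a, X, a, length_scale)
    gg = rkhs_inner(Y, b, Y, b, length_scale)
    return fg / np.sqrt(ff * gg)

def rotation_z(theta):
    """Rotation matrix about the z-axis by angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))   # hypothetical point coordinates
a = rng.normal(size=5)        # hypothetical expansion weights (stand-in for marks)

# Second set: the same points in shuffled order (labels unknown),
# then rotated and translated.
perm = rng.permutation(5)
R = rotation_z(0.7)
t = np.array([0.5, -0.2, 0.1])
Y = X[perm] @ R.T + t
b = a[perm]

sim_misaligned = similarity(X, a, Y, b)
# Undo the rigid-body transformation; no point correspondence is needed.
sim_aligned = similarity(X, a, (Y - t) @ R, b)
```

Because the double sum runs over all pairs of points, the similarity does not depend on the order in which points are listed, so it reaches its maximum of 1 once the rigid-body transformation is undone even though the second set is shuffled. An alignment procedure could search over rotations and translations to maximize such a quantity; the article instead carries out posterior inference on the alignment using Markov chain Monte Carlo.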

Citation

Irina Czogiel, Ian L. Dryden, Christopher J. Brignell. "Bayesian matching of unlabeled marked point sets using random fields, with an application to molecular alignment." Ann. Appl. Stat. 5(4): 2603-2629, December 2011. https://doi.org/10.1214/11-AOAS486

Information

Published: December 2011

First available in Project Euclid: 20 December 2011

zbMATH: 1234.62141

MathSciNet: MR2907128

Digital Object Identifier: 10.1214/11-AOAS486

Keywords: bioinformatics, chemoinformatics, kriging, Markov chain Monte Carlo, Procrustes, reproducing kernel Hilbert space, shape, size, spatial, steroids
