The Annals of Applied Statistics

Bayesian alignment of similarity shapes

Kanti V. Mardia, Christopher J. Fallaize, Stuart Barber, Richard M. Jackson, and Douglas L. Theobald


We develop a Bayesian model for the alignment of two point configurations under the full similarity transformations of rotation, translation and scaling. Other work in this area has concentrated on rigid body transformations, where scale information is preserved, motivated by problems involving molecular data; this is known as form analysis. We concentrate on a Bayesian formulation for statistical shape analysis. We generalize the model introduced by Green and Mardia [Biometrika 93 (2006) 235–254] for the pairwise alignment of two unlabeled configurations to full similarity transformations by introducing a scaling factor to the model. The generalization is not straightforward, since the model needs to be reformulated to give good performance when scaling is included. We illustrate our method on the alignment of rat growth profiles and a novel application to the alignment of protein domains. Here, scaling is applied to secondary structure elements when comparing protein folds; additionally, we find that one global scaling factor is not in general sufficient to model these data and, hence, we develop a model in which multiple scale factors can be included to handle different scalings of shape components.
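As a point of reference for the similarity transformations named above (rotation, translation and scaling), the following is a minimal sketch of the classical least-squares Procrustes fit for two labeled configurations. It is not the Bayesian model of the paper: it assumes the landmark correspondence is already known and uses a single global scale factor, whereas the paper treats unlabeled configurations and also considers multiple scale factors. The function name `similarity_align` and the simulated data are illustrative assumptions.

```python
import numpy as np


def similarity_align(x, y):
    """Least-squares similarity fit of x onto y (illustrative helper, not from the paper).

    x, y : (n, d) arrays of corresponding landmarks (labels assumed known).
    Returns (s, R, t) such that y is approximated by s * x @ R.T + t,
    with s > 0 a scale, R a d x d rotation and t a translation vector.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Remove translation by centring both configurations.
    mx, my = x.mean(axis=0), y.mean(axis=0)
    xc, yc = x - mx, y - my

    # SVD of the cross-product matrix gives the optimal rotation.
    u, sv, vt = np.linalg.svd(yc.T @ xc)

    # Flip the last axis if needed so that R is a rotation, not a reflection.
    d = np.ones(len(sv))
    if np.linalg.det(u @ vt) < 0:
        d[-1] = -1.0
    R = u @ np.diag(d) @ vt

    # Optimal scale given the rotation, then the translation.
    s = (sv * d).sum() / (xc ** 2).sum()
    t = my - s * (mx @ R.T)
    return s, R, t


# Usage: recover a known similarity transformation from simulated landmarks.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 2))
angle = 0.5
R_true = np.array([[np.cos(angle), -np.sin(angle)],
                   [np.sin(angle), np.cos(angle)]])
y = 2.0 * x @ R_true.T + np.array([1.0, -3.0])
s_hat, R_hat, t_hat = similarity_align(x, y)
print(s_hat, t_hat)
```

Fixing `s = 1` in this sketch gives the rigid-body (form) fit that the abstract contrasts with full similarity alignment.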

Article information

Ann. Appl. Stat. Volume 7, Number 2 (2013), 989–1009.

First available in Project Euclid: 27 June 2013

Keywords

Morphometrics; protein bioinformatics; similarity transformations; statistical shape analysis; unlabeled shape analysis


Mardia, Kanti V.; Fallaize, Christopher J.; Barber, Stuart; Jackson, Richard M.; Theobald, Douglas L. Bayesian alignment of similarity shapes. Ann. Appl. Stat. 7 (2013), no. 2, 989–1009. doi:10.1214/12-AOAS615.

References

  • Bookstein, F. L. (1991). Morphometric Tools for Landmark Data: Geometry and Biology. Cambridge Univ. Press, Cambridge.
  • Branden, C. and Tooze, J. (1999). Introduction to Protein Structure, 2nd ed. Garland, New York.
  • Creedy, J. and Martin, V. L. (1994). A model for the distribution of prices. Oxford Bulletin of Economics and Statistics 56 67–76.
  • Dryden, I. L., Hirst, J. D. and Melville, J. L. (2007). Statistical analysis of unlabeled point sets: Comparing molecules in chemoinformatics. Biometrics 63 237–251, 315.
  • Dryden, I. L. and Mardia, K. V. (1998). Statistical Shape Analysis. Wiley, Chichester.
  • Green, P. J. and Mardia, K. V. (2006). Bayesian alignment using hierarchical models, with applications in protein bioinformatics. Biometrika 93 235–254.
  • Green, P. J., Mardia, K. V., Nyirongo, V. B. and Ruffieux, Y. (2010). Bayesian modelling for matching and alignment of biomolecules. In The Oxford Handbook of Applied Bayesian Analysis (A. O’Hagan and M. West, eds.) 27–50. Oxford Univ. Press, Oxford.
  • Kenobi, K. and Dryden, I. L. (2012). Bayesian matching of unlabelled point sets using Procrustes and configuration models. Bayesian Anal. 7 547–566.
  • Kenobi, K., Dryden, I. L. and Le, H. (2010). Shape curves and geodesic modelling. Biometrika 97 567–584.
  • Kent, J. T. and Mardia, K. V. (2002). Modelling strategies for spatial-temporal data. In Spatial Cluster Modelling (A. B. Lawson and D. G. T. Denison, eds.) 213–226. Chapman & Hall/CRC, Boca Raton, FL.
  • Kent, J. T., Mardia, K. V. and Taylor, C. C. (2010). Matching unlabelled configurations and protein bioinformatics. Technical report, Univ. Leeds.
  • Kent, J. T., Mardia, K. V., Morris, R. J. and Aykroyd, R. G. (2001). Functional models of growth for landmark data. In Proceedings in Functional and Spatial Data Analysis (K. V. Mardia and R. G. Aykroyd, eds.) 109–115. Leeds Univ. Press, Leeds.
  • Lye, J. and Martin, V. L. (1993). Robust estimation, nonnormalities and generalized exponential distributions. J. Amer. Statist. Assoc. 88 261–267.
  • Mardia, K. V. and Jupp, P. E. (2000). Directional Statistics. Wiley, Chichester.
  • Mardia, K. V. and Nyirongo, V. B. (2012). Bayesian hierarchical alignment methods. In Bayesian Methods in Structural Bioinformatics (T. Hamelryck, K. V. Mardia and J. Ferkinghoff-Borg, eds.) 209–232. Springer, New York.
  • Mardia, K. V., Nyirongo, V. B., Fallaize, C. J., Barber, S. and Jackson, R. M. (2011). Hierarchical Bayesian modelling of pharmacophores in bioinformatics. Biometrics 67 611–619.
  • Mardia, K. V., Fallaize, C. J., Barber, S., Jackson, R. M. and Theobald, D. L. (2013). Supplement to “Bayesian alignment of similarity shapes.” DOI:10.1214/12-AOAS615SUPP.
  • Orengo, C. A., Michie, A. D., Jones, D. T., Swindells, M. B. and Thornton, J. M. (1997). CATH: A hierarchic classification of protein domain structures. Structure 5 1093–1108.
  • Rodriguez, A. and Schmidler, S. (2013). Bayesian protein structural alignment. Ann. Appl. Stat. To appear.
  • Ruffieux, Y. and Green, P. J. (2009). Alignment of multiple configurations using hierarchical models. J. Comput. Graph. Statist. 18 756–773.
  • Schmidler, S. C. (2007). Fast Bayesian shape matching using geometric algorithms. In Bayesian Statistics 8 (J. M. Bernardo, J. Bayarri, J. O. Berger, A. P. Dawid, D. Heckerman, A. F. Smith and M. West, eds.) 471–490. Oxford Univ. Press, Oxford.
  • Srivastava, A. and Jermyn, I. H. (2009). Looking for shapes in two-dimensional cluttered point clouds. IEEE Trans. Pattern Anal. Mach. Intell. 31 1616–1629.
  • Taylor, W. R., Thornton, J. M. and Turnell, W. G. (1983). An ellipsoidal approximation of protein shape. Journal of Molecular Graphics 1 30–38.
  • Theobald, D. L. and Wuttke, D. S. (2006). Empirical Bayes hierarchical models for regularizing maximum likelihood estimation in the matrix Gaussian Procrustes problem. Proc. Natl. Acad. Sci. USA 103 18521–18527.
  • Wilkinson, D. J. (2007). Discussion of “Fast Bayesian shape matching using geometric algorithms.” In Bayesian Statistics 8 (J. M. Bernardo, J. Bayarri, J. O. Berger, A. P. Dawid, D. Heckerman, A. F. Smith and M. West, eds.) 483–487. Oxford Univ. Press, Oxford.
  • Wu, T. D., Schmidler, S. C., Hastie, T. and Brutlag, D. L. (1998). Regression analysis of multiple protein structures. J. Comput. Biol. 5 585–595.

Supplemental materials

  • Supplementary material: Simulation methods and a normal approximation for the halfnormal-gamma distribution. We describe an acceptance-rejection method for simulating from the halfnormal-gamma distribution and investigate its efficiency over a range of parameter settings. We also investigate further the normal approximation to the halfnormal-gamma distribution, which we use to obtain efficient proposals in our Metropolis updates. We show that the approximation is best for parameter values where the acceptance-rejection method is less efficient, and hence that the two simulation methods complement each other well.