Open Access
1 February 2000
Comparing DNA Fingerprints of Infectious Organisms
Hugh Salamon, Mark R. Segal, Peter M. Small
Statist. Sci. 15(1): 27-45 (1 February 2000). DOI: 10.1214/ss/1009212672


Genotypes of infectious organisms are becoming the foundation for epidemiologic studies of infectious disease. Central to the use of such data is a means for comparing genotypes. We develop methods for this purpose in the context of DNA fingerprint genotyping of tuberculosis, but our approach is applicable to many fingerprint-based genotyping systems and organisms. Data available on replicate (laboratory) strains here reveal that (i) error in fingerprint band size is proportional to band size and (ii) errors are positively correlated within a fingerprint. Comparison (or matching) scores computed to account for this error structure need to be “standardized” in order to properly rank the comparisons. We demonstrate the utility of using extreme value distributions to effect such standardization. Several estimation issues for the extreme value parameters are discussed, including a lack of robustness of (approximate) maximum likelihood estimates. Interesting findings to emerge from examination of quantiles of standardized matching scores include (i) formal significance is not attainable when querying a database for a given fingerprint pattern and (ii) maximal matching probabilities are not necessarily monotonically decreasing with increasing numbers of fingerprint bands.
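To make the standardization idea concrete, the following is a minimal sketch, not the authors' procedure. It assumes that matching scores behave like maxima and can be described by a Gumbel (type I extreme value) distribution, and it fits that distribution by the method of moments, one of the estimation approaches named in the keywords. The function names, the choice of the Gumbel family, and the sample scores are all illustrative assumptions; the abstract does not specify the score definition or the exact extreme value form used.

```python
import math
import statistics

# Euler-Mascheroni constant, used in the Gumbel mean formula.
EULER_GAMMA = 0.5772156649015329


def gumbel_moment_fit(scores):
    """Method-of-moments fit of a Gumbel (maxima) distribution.

    Uses mean = loc + gamma * scale and variance = (pi * scale)^2 / 6.
    Hypothetical helper: the Gumbel family is an assumption here.
    """
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation
    scale = sd * math.sqrt(6.0) / math.pi
    loc = mean - EULER_GAMMA * scale
    return loc, scale


def standardize(score, loc, scale):
    """Map a raw matching score onto the standard Gumbel scale."""
    return (score - loc) / scale


def gumbel_upper_tail(z):
    """P(Z > z) for a standard Gumbel variable: 1 - exp(-exp(-z))."""
    return -math.expm1(-math.exp(-z))


if __name__ == "__main__":
    # Made-up scores for illustration only; not data from the paper.
    raw_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
    loc, scale = gumbel_moment_fit(raw_scores)
    z = standardize(5.0, loc, scale)
    print(loc, scale, z, gumbel_upper_tail(z))
```

Standardized scores from comparisons with different numbers of bands can then be ranked on a common scale, which is the role the abstract assigns to standardization. Moment estimates are shown here only because they are simple to compute; the abstract itself notes robustness problems with (approximate) maximum likelihood estimates.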



Hugh Salamon, Mark R. Segal, Peter M. Small. "Comparing DNA Fingerprints of Infectious Organisms." Statist. Sci. 15 (1) 27-45, 1 February 2000.


Published: 1 February 2000
First available in Project Euclid: 24 December 2001

Digital Object Identifier: 10.1214/ss/1009212672

Keywords: extreme value distribution, genotyping, maximum likelihood estimation, moment estimation, tuberculosis

Rights: Copyright © 2000 Institute of Mathematical Statistics

Vol. 15 • No. 1 • February 2000