The Annals of Statistics

On the bias in estimating genetic length and other quantities in simplex constrained models

Arthur Cohen, J.H.B. Kemperman, and Harold Sackrowitz

Abstract

The genetic distance between two loci on a chromosome is defined as the mean number of crossovers between the loci. The parameters of the crossover distribution are constrained by the parameters of the distribution of chiasmata. Ott (1996) derived the maximum likelihood estimator (MLE) of the parameters of the crossover distribution and the MLE of the mean. We demonstrate that the MLE of the mean is pointwise less than or equal to the empirical mean number of crossovers. Because the empirical mean is unbiased, it follows that the MLE is negatively biased. For small sample sizes the bias can be nonnegligible. We recommend reduced-bias estimators.

We generalize the result to many other problems involving linear constraints on parameters, including a variety of problems with simplex constraints as studied recently by Liu (2000).
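
The inequality between the constrained MLE of the mean and the empirical mean can be illustrated numerically. The sketch below is not the authors' or Ott's procedure. It assumes a Mather-type model in which the number of chiasmata takes values 0, ..., m with probabilities p on the simplex, and each chiasma independently yields a crossover on the observed strand with probability 1/2, so the crossover distribution is q = A p with binomial columns. The MLE is approximated with a standard EM update for mixture weights. The function names, the iteration count, and the toy counts are hypothetical choices made for illustration.

```python
from math import comb


def thinning_matrix(m):
    """A[j][k] = P(j crossovers | k chiasmata) = C(k, j) / 2**k (assumed model)."""
    return [[comb(k, j) / 2 ** k if j <= k else 0.0 for k in range(m + 1)]
            for j in range(m + 1)]


def crossover_dist(A, p):
    """Crossover distribution q = A p implied by chiasma distribution p."""
    size = len(p)
    return [sum(A[j][k] * p[k] for k in range(size)) for j in range(size)]


def constrained_mle(counts, iters=500):
    """Approximate the MLE of p (and q = A p) by EM on the mixture weights.

    counts[j] is the number of observations with j crossovers.
    """
    m = len(counts) - 1
    n = sum(counts)
    A = thinning_matrix(m)
    p = [1.0 / (m + 1)] * (m + 1)  # start in the interior of the simplex
    for _ in range(iters):
        q = crossover_dist(A, p)
        # Multiplicative EM update: p_k <- p_k * (1/n) * sum_j n_j A[j][k] / q_j
        p = [p[k] * sum(counts[j] * A[j][k] / q[j]
                        for j in range(m + 1) if counts[j] > 0) / n
             for k in range(m + 1)]
    return p, crossover_dist(A, p)


def mean_of(dist):
    """Mean of a distribution on 0, 1, ..., len(dist) - 1."""
    return sum(j * w for j, w in enumerate(dist))


# Toy data (hypothetical): 10 observations with 0 crossovers, 10 with 2.
counts = [10, 0, 10]
p_hat, q_hat = constrained_mle(counts)
emp_mean = sum(j * c for j, c in enumerate(counts)) / sum(counts)
mle_mean = mean_of(q_hat)

print("empirical mean:", emp_mean)
print("constrained MLE of the mean:", mle_mean)
```

For these counts the empirical distribution (1/2, 0, 1/2) lies outside the constrained set, because under the assumed model the probability of two crossovers cannot exceed 1/4. Working the maximization by hand gives p = (1/3, 0, 2/3) and a constrained mean of 2/3, below the empirical mean of 1. When the empirical distribution already satisfies the constraints, the two means coincide.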

Article information

Source
Ann. Statist., Volume 30, Number 1 (2002), 202-219.

Dates
First available in Project Euclid: 5 March 2002

Permanent link to this document
https://projecteuclid.org/euclid.aos/1015362190

Digital Object Identifier
doi:10.1214/aos/1015362190

Mathematical Reviews number (MathSciNet)
MR1892661

Zentralblatt MATH identifier
1012.62114

Subjects
Primary: 92D10: Genetics {For genetic algebras, see 17D92}; 62F10: Point estimation

Keywords
Crossovers; chiasma; maximum likelihood estimation; order-restricted inference; nonlinear programming

Citation

Cohen, Arthur; Kemperman, J.H.B.; Sackrowitz, Harold. On the bias in estimating genetic length and other quantities in simplex constrained models. Ann. Statist. 30 (2002), no. 1, 202-219. doi:10.1214/aos/1015362190. https://projecteuclid.org/euclid.aos/1015362190


References

  • ANDERSON, T. W. (1971). The Statistical Analysis of Time Series. Wiley, New York.
  • COHEN, A., KEMPERMAN, J. H. B. and SACKROWITZ, H. B. (1994). Unbiased testing in exponential family regression. Ann. Statist. 22 1931-1946.
  • LEE, C. C. (1988). Quadratic loss of order restricted estimators for treatment means with a control. Ann. Statist. 16 751-758.
  • LIU, C. (2000). Estimation of discrete distributions with a class of simplex constraints. J. Amer. Statist. Assoc. 95 109-120.
  • MATHER, K. (1933). The relation between chiasmata and crossing-over in diploid and triploid Drosophila melanogaster. J. Genetics 27 243-259.
  • MATHER, K. (1938). Crossing-over. Biol. Rev. Cambridge Philos. Soc. 13 252-292.
  • OTT, J. (1996). Estimating crossover frequencies and testing for numerical interference with highly polymorphic markers. In Genetic Mapping and DNA Sequencing (T. Speed and M. S. Waterman, eds.) 49-63. Springer, New York.
  • ROBERTSON, T., WRIGHT, F. T. and DYKSTRA, R. L. (1988). Order Restricted Statistical Inference. Wiley, New York.
  • YU, K. and FEINGOLD, E. (2001). Estimating the frequency distribution of crossovers during meiosis from recombination data. Biometrics 57 427-434.
  • ZANGWILL, W. I. and MOND, B. (1969). Nonlinear Programming: A Unified Approach. Prentice-Hall, Englewood Cliffs, NJ.