Communications in Information & Systems
- Commun. Inf. Syst.
- Volume 10, Number 1 (2010), 23-38.
Theory and Algorithms for the Haplotype Assembly Problem
Genome sequencing studies to date have generally sought to assemble consensus genomes by merging sequence contributions from multiple homologous copies of each chromosome. With growing interest in genetic variations, however, there is a need for methods to separate these distinct contributions and assess how individual homologous chromosome copies differ from one another. An approach to this problem was developed using small sequence fragments derived from shotgun sequencing studies to determine the patterns of variations that co-occur on individual chromosomes. This has become known as the "haplotype assembly" problem. This review paper surveys results on the theory and algorithms for haplotype assembly. It first describes common abstractions of the problem. It then discusses some notable intractability results for different problem variants. It next examines a variety of combinatorial, statistical, and heuristic methods for assembling fragment data sets in practice. The review concludes with a discussion of recent directions in diploid genome sequencing and their implications for haplotype assembly in the future.
First available in Project Euclid: 9 March 2010
Schwartz, Russell. Theory and Algorithms for the Haplotype Assembly Problem. Commun. Inf. Syst. 10 (2010), no. 1, 23--38. https://projecteuclid.org/euclid.cis/1268143371