References
Bai, J. and Perron, P. (2003). Computation and analysis of multiple structural change models. J. Appl. Econometrics 18 1–22.
Bellman, R. (1961). On the approximation of curves by line segments using dynamic programming. Commun. ACM 4 284.
Benjamini, Y. and Speed, T. (2011). Estimation and correction for GC-content bias in high throughput sequencing. Technical Report 804, Dept. Statistics, Univ. California, Berkeley.
Boeva, V., Zinovyev, A., Bleakley, K., Vert, J.-P., Janoueix-Lerosey, I., Delattre, O. and Barillot, E. (2011). Control-free calling of copy number alterations in deep-sequencing data using GC-content normalization. Bioinformatics 27 268–269.
Campbell, P. J., Stephens, P. J., Pleasance, E. D., O’Meara, S., Li, H., Santarius, T., Stebbings, L. A., Leroy, C., Edkins, S., Hardy, C., Teague, J. W., Menzies, A., Goodhead, I., Turner, D. J., Clee, C. M., Quail, M. A., Cox, A., Brown, C., Durbin, R., Hurles, M. E., Edwards, P. A. W., Bignell, G. R., Stratton, M. R. and Futreal, P. A. (2008). Identification of somatically acquired rearrangements in cancer using genome-wide massively parallel paired-end sequencing. Nat. Genet. 40 722–729.
Chen, H., Xing, H. and Zhang, N. R. (2011). Estimation of parent specific DNA copy number in tumors using high-density genotyping arrays. PLoS Comput. Biol. 7 e1001060, 15.
Cheung, M.-S., Down, T. A., Latorre, I. and Ahringer, J. (2011). Systematic bias in high-throughput sequencing data and its correction by BEADS. Nucleic Acids Res. 39 e103.
Chiang, D. Y., Getz, G., Jaffe, D. B., O’Kelly, M. J., Zhao, X., Carter, S. L., Russ, C., Nusbaum, C., Meyerson, M. and Lander, E. S. (2009). High-resolution mapping of copy-number alterations with massively parallel sequencing. Nat. Methods 6 99–103.
Cobb, G. W. (1978). The problem of the Nile: Conditional solution to a changepoint problem. Biometrika 65 243–251.
Conrad, D. F., Andrews, T. D., Carter, N. P., Hurles, M. E. and Pritchard, J. K. (2006). A high-resolution survey of deletion polymorphism in the human genome. Nat. Genet. 38 75–81.
Dohm, J. C., Lottaz, C., Borodina, T. and Himmelbauer, H. (2008). Substantial biases in ultra-short read data sets from high-throughput DNA sequencing. Nucleic Acids Res. 36 e105.
Hinkley, D. V. (1970). Inference about the change-point in a sequence of random variables. Biometrika 57 1–17.
Hornik, K. (2005). A CLUE for CLUster Ensembles. J. Stat. Softw. 14.
Hornik, K. (2010). clue: Cluster ensembles. R package version 0.3-34.
Ivakhno, S., Royce, T., Cox, A. J., Evers, D. J., Cheetham, R. K. and Tavaré, S. (2010). CNAseg–a novel framework for identification of copy number changes in cancer from second-generation sequencing data. Bioinformatics 26 3051–3058.
Khaja, R., Zhang, J., MacDonald, J. R., He, Y., Joseph-George, A. M., Wei, J., Rafiq, M. A., Qian, C., Shago, M., Pantano, L., Aburatani, H., Jones, K., Redon, R., Hurles, M., Armengol, L., Estivill, X., Mural, R. J., Lee, C., Scherer, S. and Feuk, L. (2007). Genome assembly comparison to identify structural variants in the human genome. Nat. Genet. 38 1413–1418.
Lai, T. L., Xing, H. and Zhang, N. R. (2007). Stochastic segmentation models for array-based comparative genomic hybridization data analysis. Biostatistics 9 290–307.
Lai, W. R., Johnson, M. D., Kucherlapati, R. and Park, P. J. (2005). Comparative analysis of algorithms for identifying amplifications and deletions in array CGH data. Bioinformatics 21 3763–3770.
Lavielle, M. (2005). Using penalized contrasts for the change-point problem. Signal Processing 85 1501–1510.
Lipson, D., Aumann, Y., Ben-Dor, A., Linial, N. and Yakhini, Z. (2006). Efficient calculation of interval scores for DNA copy number data analysis. J. Comput. Biol. 13 215–228 (electronic).
McCarroll, S. A., Hadnott, T. N., Perry, G. H., Sabeti, P. C., Zody, M. C., Barrett, J. C., Dallaire, S., Gabriel, S. B., Lee, C., Daly, M. J., Altshuler, D. M. and The International HapMap Consortium (2006). Common deletion polymorphisms in the human genome. Nat. Genet. 38 86–92.
Medvedev, P., Stanciu, M. and Brudno, M. (2009). Computational methods for discovering structural variation with next-generation sequencing. Nat. Methods 6 S13–S20.
Olshen, A. B., Venkatraman, E. S., Lucito, R. and Wigler, M. (2004). Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics 5 557–572.
Olshen, A. B., Bengtsson, H., Neuvial, P., Spellman, P. T., Olshen, R. A. and Seshan, V. E. (2011). Parent-specific copy number in paired tumor-normal studies using circular binary segmentation. Bioinformatics 27 2038–2046.
Rabinowitz, D. (1994). Detecting clusters in disease incidence. In Change-Point Problems (South Hadley, MA, 1992). Institute of Mathematical Statistics Lecture Notes—Monograph Series 23 255–275. IMS, Hayward, CA.
Redon, R., Ishikawa, S., Fitch, K. R., Feuk, L., Perry, G. H., Andrews, T. D., Fiegler, H., Shapero, M. H., Carson, A. R., Chen, W., Cho, E. K., Dallaire, S., Freeman, J. L., Gonzalez, J. R., Gratacos, M., Huang, J., Kalaitzopoulos, D., Komura, D., MacDonald, J. R., Marshall, C. R., Mei, R., Montgomery, L., Nishimura, K., Okamura, K., Shen, F., Somerville, M. J., Tchinda, J., Valsesia, A., Woodwark, C., Yang, F., Zhang, J., Zerjal, T., Zhang, J., Armengol, L., Conrad, D. F., Estivill, X., Tyler-Smith, C., Carter, N. P., Aburatani, H., Lee, C., Jones, K. W., Scherer, S. W. and Hurles, M. E. (2006). Global variation in copy number in the human genome. Nature 444 444–454.
Schwarz, G. (1978). Estimating the dimension of a model. Ann. Statist. 6 461–464.
Shah, S. P., Lam, W. L., Ng, R. T. and Murphy, K. P. (2007). Modeling recurrent DNA copy number alterations in array CGH data. Bioinformatics 23 450–458.
Siegmund, D. (1988a). Approximate tail probabilities for the maxima of some random fields. Ann. Probab. 16 487–501.
Siegmund, D. (1988b). Confidence sets in change-point problems. Internat. Statist. Rev. 56 31–48.
Siegmund, D. O., Yakir, B. and Zhang, N. R. (2011). Detecting simultaneous variant intervals in aligned sequences. Ann. Appl. Stat. 5 645–668.
Venkatraman, E. S. and Olshen, A. B. (2007). A faster circular binary segmentation algorithm for the analysis of array CGH data. Bioinformatics 23 657–663.
Walther, G. (2010). Optimal and fast detection of spatial clusters with scan statistics. Ann. Statist. 38 1010–1033.
Wang, P., Kim, Y., Pollack, J., Narasimhan, B. and Tibshirani, R. (2005). A method for calling gains and losses in array-CGH data. Biostatistics 6 45–58.
Willenbrock, H. and Fridlyand, J. (2005). A comparison study: Applying segmentation to arrayCGH data for downstream analyses. Bioinformatics 21 4084–4091.
Xie, C. and Tammi, M. T. (2009). CNV-seq, a new method to detect copy number variation using high-throughput sequencing. BMC Bioinformatics 10 80.
Yoon, S., Xuan, Z., Makarov, V., Ye, K. and Sebat, J. (2009). Sensitive and accurate detection of copy number variants using read depth of coverage. Genome Res. 19 1586–1592.
Zhang, N. R. (2010). DNA copy number profiling in normal and tumor genomes. In Frontiers in Computational and Systems Biology (J. Feng, W. Fu and F. Sun, eds.). Computational Biology 15 259–281. Springer, London.
Zhang, N. R. and Siegmund, D. O. (2007). A modified Bayes information criterion with applications to the analysis of comparative genomic hybridization data. Biometrics 63 22–32, 309.
Zhang, N. R., Siegmund, D. O., Ji, H. and Li, J. Z. (2010). Detecting simultaneous changepoints in multiple sequences. Biometrika 97 631–645.