Open Access
Scalable Genomics with R and Bioconductor
Michael Lawrence, Martin Morgan
Statist. Sci. 29(2): 214-226 (May 2014). DOI: 10.1214/14-STS476

Abstract

This paper reviews strategies for solving problems encountered when analyzing large genomic data sets and describes the implementation of those strategies in R by packages from the Bioconductor project. We treat the scalable processing, summarization and visualization of big genomic data. The general ideas are well established and include restrictive queries, compression, iteration and parallel computing. We demonstrate the strategies by applying Bioconductor packages to the detection and analysis of genetic variants from a whole genome sequencing experiment.

Citation

Michael Lawrence, Martin Morgan. "Scalable Genomics with R and Bioconductor." Statist. Sci. 29(2): 214-226, May 2014. https://doi.org/10.1214/14-STS476

Information

Published: May 2014
First available in Project Euclid: 18 August 2014

zbMATH: 1332.62009
MathSciNet: MR3264533
Digital Object Identifier: 10.1214/14-STS476

Keywords: big data, Bioconductor, biology, genomics, R

Rights: Copyright © 2014 Institute of Mathematical Statistics
