Graduate Program in Structural and Computational Biology and Molecular Biophysics, Baylor College of Medicine, Houston, Texas, United States of America.
PLoS One. 2011 Jan 31;6(1):e16327. doi: 10.1371/journal.pone.0016327.
Copy number alterations are important contributors to many genetic diseases, including cancer. We present the readDepth package for R, which detects these aberrations by measuring the depth of coverage obtained by massively parallel sequencing of the genome. In addition to achieving higher accuracy than existing packages, our tool runs much faster by using multi-core architectures to parallelize the processing of these large data sets. In contrast to other published methods, readDepth does not require the sequencing of a reference sample, and it uses a robust statistical model that accounts for overdispersed data. It includes a method for effectively increasing the resolution obtained from low-coverage experiments by using breakpoint information from paired-end sequencing for positional refinement. We also demonstrate a method for inferring copy number from reads generated by whole-genome bisulfite sequencing, thus enabling integrative study of epigenomic and copy number alterations. Finally, we apply this tool to two genomes, showing that it performs well on genomes sequenced to both low and high coverage. The readDepth package runs on Linux and Mac OS X, is released under the Apache 2.0 license, and is available at http://code.google.com/p/readdepth/.
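The general idea of inferring copy number from depth of coverage can be illustrated with a toy sketch. This is not the readDepth implementation or its statistical model: the function names (`bin_counts`, `dispersion_index`, `copy_number_estimates`), the fixed-width bins, the median baseline, and the simulated noise-free reads are all assumptions made for illustration. It bins read start positions, scales each bin by the median count (assumed to represent the diploid state), and computes a variance-to-mean ratio, which is one simple way to see whether counts are overdispersed relative to a Poisson model.

```python
import math
import random
import statistics


def bin_counts(read_starts, genome_length, bin_size):
    """Count read start positions falling in each fixed-width bin."""
    n_bins = math.ceil(genome_length / bin_size)
    counts = [0] * n_bins
    for pos in read_starts:
        counts[pos // bin_size] += 1
    return counts


def dispersion_index(counts):
    """Variance-to-mean ratio: about 1 for Poisson counts, above 1 if overdispersed."""
    return statistics.variance(counts) / statistics.mean(counts)


def copy_number_estimates(counts, ploidy=2):
    """Scale each bin by the median count, assumed to sit at the baseline ploidy."""
    baseline = statistics.median(counts)
    return [ploidy * c / baseline for c in counts]


# Hypothetical toy genome: 100 bins of 1,000 bp, with bins 40-59 amplified to 4 copies.
rng = random.Random(0)
bin_size, n_bins = 1000, 100
true_cn = [4 if 40 <= i < 60 else 2 for i in range(n_bins)]

# Simulate exactly 10 reads per copy per bin (noise-free, for clarity only).
reads = [
    i * bin_size + rng.randrange(bin_size)
    for i, cn in enumerate(true_cn)
    for _ in range(10 * cn)
]

counts = bin_counts(reads, bin_size * n_bins, bin_size)
estimates = copy_number_estimates(counts)

print("estimated copy number, bin 0:", estimates[0])
print("estimated copy number, bin 45:", estimates[45])

# Dispersion of a small set of noisy counts from a hypothetical copy-neutral region.
print("dispersion index:", dispersion_index([18, 25, 12, 30, 15, 20]))
```

Real sequencing counts are noisy and typically show more variance than a Poisson model predicts, which is why the abstract emphasizes a model that accounts for overdispersion. A median baseline also assumes that most of the genome is at the normal copy number, which may not hold for highly rearranged tumor genomes.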