Structural Biology and Biocomputing Programme, Spanish National Cancer Center, Madrid, Spain.
PLoS One. 2007 Aug 15;2(8):e737. doi: 10.1371/journal.pone.0000737.
BACKGROUND: Copy number alterations (CNAs) in genomic DNA have been associated with complex human diseases, including cancer. One of the most common techniques for detecting CNAs is array-based comparative genomic hybridization (aCGH). The availability of aCGH platforms and the need to identify CNAs have resulted in a wealth of methodological studies.

METHODOLOGY/PRINCIPAL FINDINGS: ADaCGH is an R package and a web-based application for the analysis of aCGH data. It implements eight methods for detecting CNAs (gains and losses of genomic DNA), including all of the best-performing ones from two recent reviews (CBS, GLAD, CGHseg, HMM). For improved speed, we use parallel computing (via MPI). Additional information (GO terms, PubMed citations, KEGG and Reactome pathways) is available for individual genes and for sets of genes with altered copy numbers.

CONCLUSIONS/SIGNIFICANCE: ADaCGH represents a qualitative increase in the standards of these types of applications: a) all of the best-performing algorithms are included, not just one or two; b) we do not limit ourselves to providing a thin layer of CGI on top of existing BioConductor packages, but instead carefully use parallelization, examining different schemes, and achieve significant decreases in user waiting time (by factors of up to 45x); c) we have added functionality not currently available in some methods, to adapt to recent recommendations (e.g., merging of segmentation results in the wavelet-based and CGHseg algorithms); d) we incorporate redundancy, fault tolerance, and checkpointing, which are unique among web-based, parallelized applications; e) all of the code is available under open source licenses, so that others can build upon, copy, and adapt it for other software projects.
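The merging of segmentation results mentioned in point c) can be illustrated with a small sketch. This is not ADaCGH's code (the package is written in R) and not the merging algorithm it uses; it is a deliberately simplified stand-in that joins adjacent segments whose mean log2 ratios are close and then labels each merged segment as a gain, a loss, or no change. The `Segment` type, the function names, and the `tol` and `threshold` values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """A run of consecutive probes with a common estimated level (hypothetical type)."""
    start: int   # index of the first probe, inclusive
    end: int     # index of the last probe, inclusive
    mean: float  # mean log2 ratio of the probes in the segment


def merge_segments(segments: List[Segment], tol: float = 0.1) -> List[Segment]:
    """Join adjacent segments whose means differ by at most `tol`.

    Assumes `segments` is ordered along the chromosome. The merged mean is
    the probe-count-weighted average of the means being joined.
    """
    merged: List[Segment] = []
    for seg in segments:
        if merged and abs(merged[-1].mean - seg.mean) <= tol:
            prev = merged[-1]
            n_prev = prev.end - prev.start + 1
            n_cur = seg.end - seg.start + 1
            mean = (prev.mean * n_prev + seg.mean * n_cur) / (n_prev + n_cur)
            merged[-1] = Segment(prev.start, seg.end, mean)
        else:
            # Copy so the caller's input objects are never shared or modified.
            merged.append(Segment(seg.start, seg.end, seg.mean))
    return merged


def call_state(mean: float, threshold: float = 0.2) -> str:
    """Label a segment mean as 'gain', 'loss', or 'no change' (illustrative cutoff)."""
    if mean >= threshold:
        return "gain"
    if mean <= -threshold:
        return "loss"
    return "no change"


if __name__ == "__main__":
    raw = [
        Segment(0, 9, 0.02),
        Segment(10, 19, -0.03),
        Segment(20, 29, 0.60),
        Segment(30, 34, 0.55),
        Segment(35, 49, -0.50),
    ]
    for seg in merge_segments(raw):
        print(seg.start, seg.end, round(seg.mean, 3), call_state(seg.mean))
```

With these made-up inputs, the five raw segments collapse into three: a flat region, a gained region, and a lost region. Segmentation methods can over-split a chromosome into many segments with nearly identical means, which is why a merging step of some kind is recommended before calling gains and losses.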