BACOM: in silico detection of genomic deletion types and correction of normal cell contamination in copy number data.

Affiliation

Bradley Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA.

Publication information

Bioinformatics. 2011 Jun 1;27(11):1473-80. doi: 10.1093/bioinformatics/btr183. Epub 2011 Apr 15.

DOI:10.1093/bioinformatics/btr183

PMID:21498400

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3102226/

Abstract

MOTIVATION

Identification of somatic DNA copy number alterations (CNAs) and significant consensus events (SCEs) in cancer genomes is a main task in discovering potential cancer-driving genes such as oncogenes and tumor suppressors. The recent development of SNP array technology has facilitated studies on copy number changes at a genome-wide scale with high resolution. However, existing copy number analysis methods are oblivious to normal cell contamination and cannot distinguish between contributions of cancerous and normal cells to the measured copy number signals. This contamination could significantly confound downstream analysis of CNAs and affect the power to detect SCEs in clinical samples.

RESULTS

We report here a statistically principled in silico approach, Bayesian Analysis of COpy number Mixtures (BACOM), to accurately estimate genomic deletion type and normal tissue contamination, and accordingly recover the true copy number profile in cancer cells. We tested the proposed method on two simulated datasets, two prostate cancer datasets and The Cancer Genome Atlas high-grade ovarian dataset, and obtained very promising results supported by the ground truth and biological plausibility. Moreover, based on a large number of comparative simulation studies, the proposed method gives significantly improved power to detect SCEs after in silico correction of normal tissue contamination. We develop a cross-platform open-source Java application that implements the whole pipeline of copy number analysis of heterogeneous cancer tissues including relevant processing steps. We also provide an R interface, bacomR, for running BACOM within the R environment, making it straightforward to include in existing data pipelines.
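The correction step described above can be illustrated with a small sketch. This is not the Bayesian estimator from the paper. It assumes a simple linear mixture, observed = alpha * 2 + (1 - alpha) * tumor, where alpha is the normal-cell fraction, and it assumes the deletion type of a segment is already known. The function names `estimate_normal_fraction` and `correct_copy_number` are hypothetical and are not part of BACOM or bacomR.

```python
def estimate_normal_fraction(observed_cn, deletion_type, normal_cn=2.0):
    """Estimate the normal-cell fraction alpha from one deleted segment.

    Assumption (not taken from the BACOM source code): the measured signal
    is a linear mixture, observed = alpha * normal_cn + (1 - alpha) * tumor_cn,
    with tumor_cn = 1 for a hemizygous and 0 for a homozygous deletion.
    """
    tumor_cn = {"hemizygous": 1.0, "homozygous": 0.0}[deletion_type]
    alpha = (observed_cn - tumor_cn) / (normal_cn - tumor_cn)
    return min(max(alpha, 0.0), 1.0)  # clamp to a valid fraction


def correct_copy_number(observed_cn, alpha, normal_cn=2.0):
    """Invert the same linear mixture to recover the tumor-cell copy number."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    tumor_cn = (observed_cn - alpha * normal_cn) / (1.0 - alpha)
    return max(tumor_cn, 0.0)  # a copy number cannot be negative


if __name__ == "__main__":
    # A hemizygous deletion measured at 1.4 copies implies alpha of about 0.4.
    alpha = estimate_normal_fraction(1.4, "hemizygous")
    # With that alpha, a segment measured at 2.6 copies corrects to about 3.
    print(round(alpha, 3), round(correct_copy_number(2.6, alpha), 3))
```

The sketch shows why the deletion type matters: the same observed signal implies a different alpha depending on whether the underlying deletion is hemizygous or homozygous, so a misclassified segment would bias every corrected value.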

AVAILABILITY

The cross-platform, stand-alone Java application (BACOM), the R interface (bacomR), all source code and the simulation data used in this article are freely available at the authors' web site: http://www.cbil.ece.vt.edu/software.htm.

Similar articles

Genome-wide identification of significant aberrations in cancer genome.

BMC Genomics. 2012 Jul 27;13:342. doi: 10.1186/1471-2164-13-342.

BACOM2.0 facilitates absolute normalization and quantification of somatic copy number alterations in heterogeneous tumor.

Sci Rep. 2015 Sep 9;5:13955. doi: 10.1038/srep13955.

AISAIC: a software suite for accurate identification of significant aberrations in cancers.

Bioinformatics. 2014 Feb 1;30(3):431-3. doi: 10.1093/bioinformatics/btt693. Epub 2013 Nov 29.

VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing.

Genome Res. 2012 Mar;22(3):568-76. doi: 10.1101/gr.129684.111. Epub 2012 Feb 2.

Distinguishing somatic and germline copy number events in cancer patient DNA hybridized to whole-genome SNP genotyping arrays.

Methods Mol Biol. 2013;973:355-72. doi: 10.1007/978-1-62703-281-0_22.

TAGCNA: a method to identify significant consensus events of copy number alterations in cancer.

PLoS One. 2012;7(7):e41082. doi: 10.1371/journal.pone.0041082. Epub 2012 Jul 18.

Hierarchical discovery of large-scale and focal copy number alterations in low-coverage cancer genomes.

BMC Bioinformatics. 2020 Apr 16;21(1):147. doi: 10.1186/s12859-020-3480-3.

Minimum error calibration and normalization for genomic copy number analysis.

Genomics. 2020 Sep;112(5):3331-3341. doi: 10.1016/j.ygeno.2020.05.008. Epub 2020 May 13.

CNApp, a tool for the quantification of copy number alterations and integrative analysis revealing clinical implications.

Elife. 2020 Jan 15;9:e50267. doi: 10.7554/eLife.50267.

Cited by

Obtaining spatially resolved tumor purity maps using deep multiple instance learning in a pan-cancer study.

Patterns (N Y). 2021 Dec 9;3(2):100399. doi: 10.1016/j.patter.2021.100399. eCollection 2022 Feb 11.

MFCNV: A New Method to Detect Copy Number Variations From Next-Generation Sequencing Data.

Front Genet. 2020 May 15;11:434. doi: 10.3389/fgene.2020.00434. eCollection 2020.

Accurate Inference of Tumor Purity and Absolute Copy Numbers From High-Throughput Sequencing Data.

Front Genet. 2020 Apr 30;11:458. doi: 10.3389/fgene.2020.00458. eCollection 2020.

DBS: a fast and informative segmentation algorithm for DNA copy number analysis.

BMC Bioinformatics. 2019 Jan 3;20(1):1. doi: 10.1186/s12859-018-2565-8.

Integrated Proteogenomic Characterization of Human High-Grade Serous Ovarian Cancer.

Cell. 2016 Jul 28;166(3):755-765. doi: 10.1016/j.cell.2016.05.069. Epub 2016 Jun 29.

BACOM2.0 facilitates absolute normalization and quantification of somatic copy number alterations in heterogeneous tumor.

Sci Rep. 2015 Sep 9;5:13955. doi: 10.1038/srep13955.

Integration of Network Biology and Imaging to Study Cancer Phenotypes and Responses.

IEEE/ACM Trans Comput Biol Bioinform. 2014 Nov-Dec;11(6):1009-19. doi: 10.1109/TCBB.2014.2338304. Epub 2014 Jul 16.

UNDO: a Bioconductor R package for unsupervised deconvolution of mixed gene expressions in tumor samples.

Bioinformatics. 2015 Jan 1;31(1):137-9. doi: 10.1093/bioinformatics/btu607. Epub 2014 Sep 10.

MethylPurify: tumor purity deconvolution and differential methylation detection from single tumor DNA methylomes.

Genome Biol. 2014 Aug 7;15(8):419. doi: 10.1186/s13059-014-0419-x.

AbsCN-seq: a statistical method to estimate tumor purity, ploidy and absolute copy numbers from next-generation sequencing data.

Bioinformatics. 2014 Apr 15;30(8):1056-1063. doi: 10.1093/bioinformatics/btt759. Epub 2014 Jan 2.
