CMDS: a population-based method for identifying recurrent DNA copy number aberrations in cancer from high-resolution data.

Affiliation

Division of Statistical Genomics, Washington University School of Medicine, St Louis, MO, USA.

Publication information

Bioinformatics. 2010 Feb 15;26(4):464-9. doi: 10.1093/bioinformatics/btp708. Epub 2009 Dec 23.

Abstract

MOTIVATION

DNA copy number aberration (CNA) is a hallmark of genomic abnormality in tumor cells. Recurrent CNA (RCNA) occurs in the same chromosomal region across multiple cancer samples and has greater implications for tumorigenesis. Commonly used methods for RCNA identification require CNA calling in individual samples before cross-sample analysis. This two-step strategy may impose a heavy computational burden and lose overall statistical power because each sample's data are segmented and discretized. We propose a population-based approach for RCNA detection that requires no single-sample analysis, is statistically powerful and computationally efficient, and is particularly suitable for high-resolution and large-population studies.

RESULTS

Our approach, correlation matrix diagonal segmentation (CMDS), identifies RCNAs based on a between-chromosomal-site correlation analysis. By directly using the raw intensity ratio data from all samples and adopting a diagonal transformation strategy, CMDS substantially reduces the computational burden and can obtain results very quickly from large datasets. Our simulation indicates that the statistical power of CMDS is higher than that of two-step approaches based on single-sample CNA calling. We applied CMDS to two real datasets, one of lung cancer and one of brain cancer, from the Affymetrix and Illumina array platforms, respectively, and successfully identified known regions of CNA associated with EGFR, KRAS and other important oncogenes. CMDS provides a fast, powerful and easily implemented tool for the RCNA analysis of large-scale data from cancer genomes.
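
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of scoring sites by between-site correlation near the diagonal of the correlation matrix. It is not the published CMDS implementation and omits its statistical test. The function names `band_scores` and `simulate`, the `half_width` parameter, and all simulated values are assumptions made for illustration. The sketch assumes that a recurrent aberration shifts the intensity ratios of neighbouring sites together in a subset of samples, so those sites correlate across samples, while sites outside the region do not.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def band_scores(data, half_width):
    """Average each site's correlation with its neighbours inside a diagonal band.

    data: list of samples, each a list of intensity (log-)ratios per site.
    half_width: hypothetical band half-width, in number of sites.
    Only entries near the diagonal of the site-by-site correlation matrix
    are computed, so the full matrix is never built.
    """
    n_sites = len(data[0])
    columns = [[sample[j] for sample in data] for j in range(n_sites)]
    scores = []
    for j in range(n_sites):
        lo, hi = max(0, j - half_width), min(n_sites, j + half_width + 1)
        neighbours = [k for k in range(lo, hi) if k != j]
        total = sum(pearson(columns[j], columns[k]) for k in neighbours)
        scores.append(total / len(neighbours))
    return scores


def simulate(n_samples=60, n_sites=200, region=(80, 120),
             n_carriers=24, shift=1.0, seed=0):
    """Toy data: Gaussian noise plus a shared gain in `region` for some samples."""
    rng = random.Random(seed)
    data = []
    for i in range(n_samples):
        row = [rng.gauss(0.0, 0.5) for _ in range(n_sites)]
        if i < n_carriers:  # the first n_carriers samples carry the aberration
            for j in range(region[0], region[1]):
                row[j] += shift
        data.append(row)
    return data


data = simulate()
scores = band_scores(data, half_width=10)
inside = sum(scores[80:120]) / 40
outside = (sum(scores[:60]) + sum(scores[140:])) / 120
print(f"mean score inside region: {inside:.3f}, outside: {outside:.3f}")
```

In this toy setting the per-site score is computed from all samples at once, without calling CNAs in any individual sample, which is the population-based property the abstract emphasises. Restricting the computation to a band keeps the cost proportional to the number of sites times the band width, not to the square of the number of sites.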
