Cancer driver gene discovery through an integrative genomics approach in a non-parametric Bayesian framework.

Author information

Yang Hai, Wei Qiang, Zhong Xue, Yang Hushan, Li Bingshan

Affiliations

Department of Molecular Physiology & Biophysics, Vanderbilt University, Nashville, TN, USA.

Vanderbilt Genetics Institute, Nashville, TN, USA.

Publication information

Bioinformatics. 2017 Feb 15;33(4):483-490. doi: 10.1093/bioinformatics/btw662.

Abstract

MOTIVATION

A comprehensive catalogue of the genes that drive tumor initiation and progression is key to advancing cancer diagnostics and therapeutics. Given the complexity of cancer, this catalogue is still far from complete. Increasing evidence shows that driver genes exhibit consistent aberration patterns across multiple omics layers in tumors. In this study, we aim to leverage the complementary information encoded in each omics data type to identify novel driver genes through an integrative framework. Specifically, we integrated mutations, gene expression, DNA copy number, DNA methylation and protein abundance, all available in The Cancer Genome Atlas (TCGA), and developed iDriver, a non-parametric Bayesian framework based on multivariate statistical modeling that identifies driver genes in an unsupervised fashion. iDriver captures the inherent clusters of gene aberrations and constructs a background distribution, which is used to assess and calibrate the confidence of driver genes identified from multi-dimensional genomic data.

RESULTS

We applied the method to four cancer types in TCGA and identified candidate driver genes that are highly enriched for known drivers (e.g. P < 3.40 × 10^-36 for breast cancer). We were particularly interested in novel genes and observed multiple lines of supporting evidence for them. Through systematic evaluation from multiple independent angles, we identified 45 candidate driver genes across these four cancer types that were not previously known. These findings imply that integrating additional genomic data with multivariate statistics can help identify cancer drivers and guide the next stage of cancer genomics research.
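
The abstract does not say which test produced the enrichment P-value. One common way to quantify how strongly a candidate list overlaps a set of known drivers is a one-sided hypergeometric test, and the sketch below assumes that choice. The function name and all gene counts are hypothetical and are not taken from the paper.

```python
from math import comb


def hypergeom_enrichment_p(population, known_in_population, selected, known_in_selected):
    """Upper-tail hypergeometric probability P(X >= known_in_selected).

    population          -- total number of genes tested (hypothetical)
    known_in_population -- number of those genes that are known drivers
    selected            -- number of candidate genes called by the method
    known_in_selected   -- number of candidates that are known drivers
    """
    # The overlap can never exceed either the known set or the candidate set.
    upper = min(known_in_population, selected)
    total = comb(population, selected)
    # Count the ways to draw k known drivers and (selected - k) other genes.
    tail = sum(
        comb(known_in_population, k)
        * comb(population - known_in_population, selected - k)
        for k in range(known_in_selected, upper + 1)
    )
    # Exact integer arithmetic up to this single division.
    return tail / total


if __name__ == "__main__":
    # Illustrative counts only: 20000 genes, 500 known drivers,
    # 100 candidates, 40 of which are known drivers.
    p = hypergeom_enrichment_p(20000, 500, 100, 40)
    print(f"enrichment P-value: {p:.3e}")
```

Working with exact integers until the final division avoids the underflow that can occur when very small tail probabilities are accumulated in floating point.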

AVAILABILITY AND IMPLEMENTATION

The C++ source code is freely available at https://medschool.vanderbilt.edu/cgg/.

CONTACTS

hai.yang@vanderbilt.edu or bingshan.li@Vanderbilt.Edu.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
