Personalized pathway enrichment map of putative cancer genes from next generation sequencing data.

Affiliation

Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America.

Publication information

PLoS One. 2012;7(5):e37595. doi: 10.1371/journal.pone.0037595. Epub 2012 May 18.

DOI:10.1371/journal.pone.0037595

PMID:22624051

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3356304/

Abstract

BACKGROUND

Pathway analysis of a set of genes represents an important area in large-scale omic data analysis. However, the application of traditional pathway enrichment methods to next-generation sequencing (NGS) data is prone to several potential biases, including genomic/genetic factors (e.g., the particular disease and gene length) and environmental factors (e.g., personal life-style and frequency and dosage of exposure to mutagens). Therefore, novel methods are urgently needed for these new data types, especially for individual-specific genome data.

METHODOLOGY

In this study, we proposed a novel method for the pathway analysis of NGS mutation data by explicitly taking into account the gene-wise mutation rate. We estimated the gene-wise mutation rate based on the individual-specific background mutation rate along with the gene length. Taking the mutation rate as a weight for each gene, our weighted resampling strategy builds the null distribution for each pathway while matching the gene length patterns. The empirical P value obtained then provides an adjusted statistical evaluation.
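The weighted resampling idea described above can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes that the gene-wise weight is the probability of at least one mutation in a gene of a given length under an independent per-base background rate, and that the null distribution is built by drawing as many genes as the individual has mutated genes. The function names, the weight formula, and the one-sided test are all assumptions made for this example.

```python
import heapq
import math
import random


def mutation_probability(length_bp, background_rate):
    """P(at least one mutation) in a gene, assuming independent per-base events.

    Assumed formula: 1 - (1 - r)^L, computed in a numerically stable way.
    """
    return -math.expm1(length_bp * math.log1p(-background_rate))


def weighted_sample(weights, k, rng):
    """Draw k distinct genes with probability proportional to weight.

    Each gene gets an exponential arrival time with rate equal to its weight;
    the k earliest arrivals form a weighted sample without replacement.
    """
    timed = ((rng.expovariate(1.0) / w, gene) for gene, w in weights.items() if w > 0)
    return [gene for _, gene in heapq.nsmallest(k, timed)]


def empirical_pathway_p(mutated_genes, pathway_genes, gene_lengths,
                        background_rate, n_resamples=10000, seed=0):
    """Empirical P value for the overlap between mutated genes and one pathway."""
    rng = random.Random(seed)
    weights = {g: mutation_probability(length, background_rate)
               for g, length in gene_lengths.items()}
    # Mutated genes without a known length are ignored in this sketch.
    mutated = set(mutated_genes) & set(weights)
    pathway = set(pathway_genes)
    observed = len(mutated & pathway)

    hits = 0
    for _ in range(n_resamples):
        # Null draw: same number of genes, longer genes more likely to be picked.
        null_genes = weighted_sample(weights, len(mutated), rng)
        if len(pathway.intersection(null_genes)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)
```

Because long genes are drawn more often under the null, a pathway made up of long genes needs a larger observed overlap to reach the same empirical P value than a pathway of short genes, which is the adjustment the paragraph describes.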

PRINCIPAL FINDINGS/CONCLUSIONS

We applied our weighted resampling method to a lung adenocarcinoma dataset and a glioblastoma dataset, and compared it with other widely used methods. By explicitly adjusting for gene length, the weighted resampling method performs as well as the standard methods for significant pathways with strong evidence. Importantly, our method effectively rejected many marginally significant pathways detected by standard methods, including several cancer-unrelated pathways composed largely of long genes. We further demonstrated that, once such biases are reduced, pathway crosstalk within each individual and the pathway co-mutation map across multiple individuals can be objectively explored and evaluated. This method performs pathway analysis in a sample-centered fashion and offers an alternative approach to the accurate analysis of personalized cancer genomes. It can be extended to other types of genomic data (genotyping and methylation) that have similar bias problems.
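One simple way to picture a pathway co-mutation map across individuals is to count, for every pair of pathways, how many individuals have both pathways called significant. The sketch below rests on that assumption only; the abstract does not specify how the map is built, and the function name, input layout, and pathway labels are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_mutation_counts(significant_by_sample):
    """Count individuals in which each pair of pathways is jointly significant.

    significant_by_sample maps a sample ID to the pathways called significant
    for that individual (hypothetical input layout).
    """
    counts = Counter()
    for pathways in significant_by_sample.values():
        # Sort so that each unordered pair has one canonical key.
        for pair in combinations(sorted(set(pathways)), 2):
            counts[pair] += 1
    return counts


# Hypothetical per-individual results.
example = {
    "sample1": ["MAPK", "P53"],
    "sample2": ["MAPK", "P53", "WNT"],
    "sample3": ["WNT"],
}
pair_counts = co_mutation_counts(example)
```

The resulting counts can serve as edge weights in a network whose nodes are pathways.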


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e51c/3356304/fa4a0b3ac61b/pone.0037595.g001.jpg
