NetGen: a novel network-based probabilistic generative model for gene set functional enrichment analysis.

Authors

Sun Duanchen, Liu Yinliang, Zhang Xiang-Sun, Wu Ling-Yun

Affiliations

Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, China.

National Center for Mathematics and Interdisciplinary Sciences, Chinese Academy of Sciences, Beijing, 100190, China.

Publication information

BMC Syst Biol. 2017 Sep 21;11(Suppl 4):75. doi: 10.1186/s12918-017-0456-7.

DOI:10.1186/s12918-017-0456-7

PMID:28950861

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5615262/

Abstract

BACKGROUND

High-throughput experimental techniques have improved dramatically and been widely applied over the past decades. However, biological interpretation of high-throughput experimental results, such as differentially expressed gene sets derived from microarray or RNA-seq experiments, remains a challenging task. Gene Ontology (GO) is commonly used in functional enrichment studies. The GO terms identified by current functional enrichment analysis tools often include direct parent or descendant terms of one another in the GO hierarchy. Such highly redundant terms make it difficult for users to analyze the underlying biological processes.
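To make the redundancy problem concrete, the following is a minimal sketch of a conventional individual-term enrichment test (a one-sided hypergeometric test), not the NetGen model itself. The gene identifiers, term contents, and function names are hypothetical toy values chosen for illustration; the only assumption taken from GO is that a parent term inherits the genes annotated to its child.

```python
from math import comb


def hypergeom_pvalue(population, annotated, sample, overlap):
    """Return P(X >= overlap) when `sample` genes are drawn without
    replacement from `population` genes, `annotated` of which carry the term."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical toy data: 20 background genes and three GO-like terms.
background = {f"g{i}" for i in range(1, 21)}
child_term = {"g1", "g2", "g3", "g4"}
parent_term = child_term | {"g5"}  # the parent inherits the child's genes
unrelated_term = {"g10", "g11", "g12", "g13"}

# Hypothetical active (e.g. differentially expressed) gene list.
active = {"g1", "g2", "g3", "g6"}


def enrich(term_genes):
    """Test a single term against the active gene list."""
    return hypergeom_pvalue(
        len(background), len(term_genes), len(active), len(term_genes & active)
    )


p_child = enrich(child_term)
p_parent = enrich(parent_term)
p_unrelated = enrich(unrelated_term)
```

In this toy setting both the child and its parent fall below a 0.05 threshold because they share almost the same genes, while the unrelated term does not. Testing each term in isolation therefore reports the same signal twice, which is the redundancy described above.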

RESULTS

In this paper, a novel network-based probabilistic generative model, NetGen, is proposed for functional enrichment analysis. An additional protein-protein interaction (PPI) network is used explicitly to assist the identification of significantly enriched GO terms. NetGen outperformed existing methods in simulation studies. Its effectiveness was explored further on four real datasets. Notably, for each disease, several GO terms were identified that were not directly linked with the active gene list. According to the curated literature, these terms are closely related to the corresponding diseases. NetGen is implemented in the R package CopTea, publicly available on GitHub ( http://github.com/wulingyun/CopTea/ ).

CONCLUSION

Our procedure yields more reasonable and interpretable functional enrichment results. As a novel term combination-based functional enrichment analysis method, NetGen is complementary to current individual term-based methods and can help to explore the underlying pathogenesis of complex diseases.

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3486/5615262/52558f6ed05e/12918_2017_456_Fig1_HTML.jpg
