GOMCL: a toolkit to cluster, evaluate, and extract non-redundant associations of Gene Ontology-based functions.

Affiliation

Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA.

Publication information

BMC Bioinformatics. 2020 Apr 10;21(1):139. doi: 10.1186/s12859-020-3447-4.

DOI: 10.1186/s12859-020-3447-4
PMID: 32272889
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7146957/
Abstract

BACKGROUND

Functional enrichment of genes and pathways based on Gene Ontology (GO) has been widely used to describe the results of various -omics analyses. GO terms statistically overrepresented within a large gene set are typically used to describe the main functional attributes of that set. However, these lists of overrepresented GO terms are often too large and contain redundant, overlapping GO terms, which hinders informative functional interpretation.

RESULTS

We developed GOMCL to reduce redundancy and summarize lists of GO terms effectively and informatively. This lightweight Python toolkit efficiently identifies clusters within a list of GO terms using the Markov Clustering (MCL) algorithm, based on the overlap of gene members between GO terms. GOMCL facilitates biological interpretation of a large number of GO terms by condensing them into GO clusters representing non-overlapping functional themes. It can visualize GO clusters as a heatmap, as networks based on either overlap of members or hierarchy among GO terms, and as tables with depth and cluster information for each GO term. Each GO cluster generated by GOMCL can be evaluated and further divided into non-overlapping sub-clusters using the GOMCL-sub module. The outputs from both GOMCL and GOMCL-sub can be imported into Cytoscape for additional visualization effects.
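
The clustering idea described above (score GO terms by how many gene members they share, then run MCL on the resulting graph) can be illustrated with a small sketch. This is not GOMCL's own code. The overlap coefficient as the similarity measure, the 0.5 cutoff, the inflation value of 2.0, all function names, and the toy GO labels and gene IDs are assumptions made for illustration only.

```python
from itertools import combinations


def overlap_coefficient(a, b):
    """Shared genes divided by the size of the smaller gene set (assumed measure)."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def build_similarity(go_genes, cutoff=0.5):
    """Return sorted term names and a symmetric similarity matrix with self-loops."""
    terms = sorted(go_genes)
    n = len(terms)
    matrix = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in combinations(range(n), 2):
        score = overlap_coefficient(go_genes[terms[i]], go_genes[terms[j]])
        if score >= cutoff:  # weaker overlaps are treated as "no edge"
            matrix[i][j] = matrix[j][i] = score
    return terms, matrix


def normalize_columns(m):
    """Scale every column so it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[r][c] for r in range(n)) for c in range(n)]
    return [[m[r][c] / sums[c] if sums[c] else 0.0 for c in range(n)] for r in range(n)]


def mcl(matrix, inflation=2.0, max_iter=100, tol=1e-9):
    """Plain MCL: alternate expansion (matrix squaring) and inflation until stable."""
    n = len(matrix)
    m = normalize_columns(matrix)
    for _ in range(max_iter):
        expanded = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
                    for i in range(n)]
        inflated = normalize_columns([[v ** inflation for v in row] for row in expanded])
        change = max(abs(x - y) for new, old in zip(inflated, m) for x, y in zip(new, old))
        m = inflated
        if change < tol:
            break
    # Rows that keep non-negligible mass are attractors; their non-zero columns form a cluster.
    clusters = {frozenset(j for j, v in enumerate(row) if v > 1e-6) for row in m}
    clusters.discard(frozenset())
    return clusters


def cluster_go_terms(go_genes, cutoff=0.5, inflation=2.0):
    """Cluster GO terms by gene-member overlap; returns sorted lists of term names."""
    terms, matrix = build_similarity(go_genes, cutoff)
    return sorted(sorted(terms[j] for j in members) for members in mcl(matrix, inflation))


# Toy input: hypothetical GO term labels mapped to hypothetical gene IDs.
toy_enrichment = {
    "GO:A": {"g1", "g2", "g3", "g4"},
    "GO:B": {"g1", "g2", "g3"},
    "GO:C": {"g2", "g3", "g4", "g5"},
    "GO:D": {"g10", "g11", "g12"},
    "GO:E": {"g10", "g11"},
}
go_clusters = cluster_go_terms(toy_enrichment)
print(go_clusters)
```

On this toy input the three terms that share genes g1 to g5 end up in one cluster and the two terms that share g10 to g12 in another. A higher inflation value generally yields finer clusters, so in practice it would be tuned alongside the overlap cutoff.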

CONCLUSIONS

GOMCL is a convenient toolkit to cluster, evaluate, and extract non-redundant associations of Gene Ontology-based functions. GOMCL helps researchers reduce the time spent on manual curation of large lists of GO terms, minimize biases introduced by redundant GO terms in data interpretation, and batch-process multiple GO enrichment datasets. A user guide, a test dataset, and the source code of GOMCL are available at https://github.com/Guannan-Wang/GOMCL and www.lsugenomics.org.

Figures

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff10/7146957/75e384c61dc1/12859_2020_3447_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff10/7146957/cee45413aadd/12859_2020_3447_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff10/7146957/fe159e2bb8f1/12859_2020_3447_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff10/7146957/6842af634034/12859_2020_3447_Fig4_HTML.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff10/7146957/5a7c9aa3ac2c/12859_2020_3447_Fig5_HTML.jpg

Similar articles

1. GOMCL: a toolkit to cluster, evaluate, and extract non-redundant associations of Gene Ontology-based functions.
   BMC Bioinformatics. 2020 Apr 10;21(1):139. doi: 10.1186/s12859-020-3447-4.
2. Gene Ontology analysis in multiple gene clusters under multiple hypothesis testing framework.
   Artif Intell Med. 2007 Oct;41(2):105-15. doi: 10.1016/j.artmed.2007.08.002.
3. GOTrapper: a tool to navigate through branches of gene ontology hierarchy.
   BMC Bioinformatics. 2019 Jan 11;20(1):20. doi: 10.1186/s12859-018-2581-8.
4. NaviGO: interactive tool for visualization and functional similarity and coherence analysis with gene ontology.
   BMC Bioinformatics. 2017 Mar 20;18(1):177. doi: 10.1186/s12859-017-1600-5.
5. Gogadget: An R Package for Interpretation and Visualization of GO Enrichment Results.
   Mol Inform. 2017 May;36(5-6). doi: 10.1002/minf.201600132. Epub 2016 Dec 21.
6. Summary Visualizations of Gene Ontology Terms With GO-Figure!
   Front Bioinform. 2021 Apr 1;1:638255. doi: 10.3389/fbinf.2021.638255. eCollection 2021.
7. goSTAG: gene ontology subtrees to tag and annotate genes within a set.
   Source Code Biol Med. 2017 Apr 13;12:6. doi: 10.1186/s13029-017-0066-1. eCollection 2017.
8. Visualizing GO Annotations.
   Methods Mol Biol. 2017;1446:207-220. doi: 10.1007/978-1-4939-3743-1_15.
9. FunSet: an open-source software and web server for performing and displaying Gene Ontology enrichment analysis.
   BMC Bioinformatics. 2019 Jun 27;20(1):359. doi: 10.1186/s12859-019-2960-9.
10. GO functional similarity clustering depends on similarity measure, clustering method, and annotation completeness.
   BMC Bioinformatics. 2019 Mar 27;20(1):155. doi: 10.1186/s12859-019-2752-2.

Cited by

1. Pathway Analysis Interpretation in the Multi-Omic Era.
   BioTech (Basel). 2025 Jul 29;14(3):58. doi: 10.3390/biotech14030058.
2. GeneSetCluster 2.0: a comprehensive toolset for summarizing and integrating gene-sets analysis.
   BMC Bioinformatics. 2025 Aug 21;26(1):219. doi: 10.1186/s12859-025-06249-3.
3. Cluefish: mining the dark matter of transcriptional data series with over-representation analysis enhanced by aggregated biological prior knowledge.
   NAR Genom Bioinform. 2025 Jul 30;7(3):lqaf103. doi: 10.1093/nargab/lqaf103. eCollection 2025 Sep.
4. DNA methylome biomarkers of rheumatoid arthritis-associated interstitial lung disease reflecting lung fibrosis pathways, an exploratory case-control study.
   Sci Rep. 2025 Apr 29;15(1):15123. doi: 10.1038/s41598-025-99755-6.
5. Interpreting and visualizing pathway analyses using embedding representations with PAVER.
   Bioinformation. 2024 Jul 31;20(7):700-704. doi: 10.6026/973206300200700. eCollection 2024.
6. Alfalfa Responses to Intensive Soil Compaction: Effects on Plant and Root Growth, Phytohormones and Internal Gene Expression.
   Plants (Basel). 2024 Mar 26;13(7):953. doi: 10.3390/plants13070953.
7. Comparative transcriptomics of the chilling stress response in two Asian mangrove species, Bruguiera gymnorhiza and Rhizophora apiculata.
   Tree Physiol. 2024 Feb 11;44(3). doi: 10.1093/treephys/tpae019.
8. vissE: a versatile tool to identify and visualise higher-order molecular phenotypes from functional enrichment analysis.
   BMC Bioinformatics. 2024 Feb 8;25(1):64. doi: 10.1186/s12859-024-05676-y.
9. Integrative analysis of noncoding mutations identifies the druggable genome in preterm birth.
   Sci Adv. 2024 Jan 19;10(3):eadk1057. doi: 10.1126/sciadv.adk1057.
10. Delving into gene-set multiplex networks facilitated by a k-nearest neighbor-based measure of similarity.
   Comput Struct Biotechnol J. 2023 Oct 11;21:4988-5002. doi: 10.1016/j.csbj.2023.09.042. eCollection 2023.

References

1. Low-Phosphate Chromatin Dynamics Predict a Cell Wall Remodeling Network in Rice Shoots.
   Plant Physiol. 2020 Mar;182(3):1494-1509. doi: 10.1104/pp.19.01153. Epub 2019 Dec 19.
2. g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update).
   Nucleic Acids Res. 2019 Jul 2;47(W1):W191-W198. doi: 10.1093/nar/gkz369.
3. Pathway enrichment analysis and visualization of omics data using g:Profiler, GSEA, Cytoscape and EnrichmentMap.
   Nat Protoc. 2019 Feb;14(2):482-517. doi: 10.1038/s41596-018-0103-9.
4. The Gene Ontology Resource: 20 years and still GOing strong.
   Nucleic Acids Res. 2019 Jan 8;47(D1):D330-D338. doi: 10.1093/nar/gky1055.
5. GOATOOLS: A Python library for Gene Ontology analyses.
   Sci Rep. 2018 Jul 18;8(1):10872. doi: 10.1038/s41598-018-28948-z.
6. The Reactome Pathway Knowledgebase.
   Nucleic Acids Res. 2018 Jan 4;46(D1):D649-D655. doi: 10.1093/nar/gkx1132.
7. Framework for gradual progression of cell ontogeny in the root meristem.
   Proc Natl Acad Sci U S A. 2017 Oct 17;114(42):E8922-E8929. doi: 10.1073/pnas.1707400114. Epub 2017 Oct 2.
8. agriGO v2.0: a GO analysis toolkit for the agricultural community, 2017 update.
   Nucleic Acids Res. 2017 Jul 3;45(W1):W122-W129. doi: 10.1093/nar/gkx382.
9. KEGG: new perspectives on genomes, pathways, diseases and drugs.
   Nucleic Acids Res. 2017 Jan 4;45(D1):D353-D361. doi: 10.1093/nar/gkw1092. Epub 2016 Nov 28.
10. Impact of outdated gene annotations on pathway enrichment analysis.
   Nat Methods. 2016 Aug 30;13(9):705-6. doi: 10.1038/nmeth.3963.