GeneSetCart: assembling, augmenting, combining, visualizing, and analyzing gene sets.

Authors

Marino Giacomo B, Olaiya Stephanie, Evangelista John Erol, Clarke Daniel J B, Ma'ayan Avi

Affiliation

Mount Sinai Center for Bioinformatics, Department of Pharmacological Sciences, Department of Artificial Intelligence and Human Health, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA.

Publication information

Gigascience. 2025 Jan 6;14. doi: 10.1093/gigascience/giaf025.

Abstract

Converting multiomics datasets into gene sets facilitates data integration that leads to knowledge discovery. Although there are tools developed to analyze gene sets, only a few offer the management of gene sets from multiple sources. GeneSetCart is an interactive web-based platform that enables investigators to gather gene sets from various sources; augment these sets with gene-gene coexpression correlations and protein-protein interactions; perform set operations on these sets such as union, consensus, and intersection; and visualize and analyze these gene sets, all in one place. GeneSetCart supports the upload of single or multiple gene sets, as well as fetching gene sets by searching PubMed for genes comentioned with terms in publications. Venn diagrams, heatmaps, Uniform Manifold Approximation and Projection (UMAP) plots, SuperVenn diagrams, and UpSet plots can visualize the gene sets in a GeneSetCart session to summarize the similarity and overlap among the sets. Users of GeneSetCart can also perform enrichment analysis on their assembled gene sets with external tools. All gene sets in a session can be saved to a user account for reanalysis and sharing with collaborators. GeneSetCart has a gene set library crossing feature that enables analysis of gene sets created from several National Institutes of Health Common Fund programs. For the top overlapping sets from pairs of programs, a large language model (LLM) is prompted to propose possible reasons for the high overlap. Using this feature, two use cases are presented. In addition, users of GeneSetCart can produce publication-ready reports from their uploaded sets. Text in these reports is also supplemented with an LLM. Overall, GeneSetCart is a useful resource enabling biologists without programming expertise to facilitate data integration for hypothesis generation.
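
As a rough illustration of the set operations named in the abstract (union, consensus, and intersection) and of the pairwise overlap that the visualizations summarize, the following Python sketch may help. It is not GeneSetCart's code. The function names, the example gene sets, the definition of consensus as "genes present in at least `min_sets` sets", and the use of the Jaccard index as the similarity measure are all assumptions made for this example.

```python
from collections import Counter


def union(gene_sets):
    """Genes present in at least one set."""
    return set().union(*gene_sets.values())


def intersection(gene_sets):
    """Genes present in every set (empty if no sets are given)."""
    sets = list(gene_sets.values())
    return set.intersection(*sets) if sets else set()


def consensus(gene_sets, min_sets=2):
    """Genes present in at least `min_sets` sets.

    Assumption: this threshold-based definition is illustrative only.
    """
    counts = Counter(gene for s in gene_sets.values() for gene in s)
    return {gene for gene, n in counts.items() if n >= min_sets}


def jaccard(a, b):
    """Jaccard index: size of the overlap divided by size of the union."""
    combined = a | b
    return len(a & b) / len(combined) if combined else 0.0


# Hypothetical gene sets, chosen only to exercise the functions.
gene_sets = {
    "set_a": {"TP53", "BRCA1", "EGFR"},
    "set_b": {"TP53", "EGFR", "MYC"},
    "set_c": {"TP53", "KRAS"},
}

print(sorted(union(gene_sets)))         # ['BRCA1', 'EGFR', 'KRAS', 'MYC', 'TP53']
print(sorted(intersection(gene_sets)))  # ['TP53']
print(sorted(consensus(gene_sets)))     # ['EGFR', 'TP53']
print(jaccard(gene_sets["set_a"], gene_sets["set_b"]))  # 0.5
```

With `min_sets=1` the consensus reduces to the union, and with `min_sets` equal to the number of sets it reduces to the intersection, so in this sketch the threshold interpolates between the two.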

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eea7/11984350/5078c575bb10/giaf025fig1.jpg
