Rummagene: massive mining of gene sets from supporting materials of biomedical research publications.

Affiliation

Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA.

Publication information

Commun Biol. 2024 Apr 20;7(1):482. doi: 10.1038/s42003-024-06177-7.

Abstract

Many biomedical research publications contain gene sets in their supporting tables, and these sets are currently not available for search and reuse. By crawling PubMed Central, the Rummagene server provides access to hundreds of thousands of such mammalian gene sets. So far, we scanned 5,448,589 articles to find 121,237 articles that contain 642,389 gene sets. These sets are served for enrichment analysis, free text search, and table title search. Investigating statistical patterns within the Rummagene database, we demonstrate that Rummagene can be used for transcription factor and kinase enrichment analyses, and for gene function predictions. By combining gene set similarity with abstract similarity, Rummagene can find surprising relationships between biological processes, concepts, and named entities. Overall, Rummagene brings to the surface the ability to search a massive collection of published biomedical datasets that are currently buried and inaccessible. The Rummagene web application is available at https://rummagene.com.
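
To make "enrichment analysis" and "gene set similarity" concrete, the following is a minimal sketch of how a query gene list could be ranked against a library of gene sets. It is not taken from the Rummagene codebase: the function names, the toy gene sets, and the background size of 20,000 genes are all assumptions made for illustration, and the hypergeometric tail test and Jaccard index are common choices for this kind of overlap analysis rather than the authors' confirmed method.

```python
from math import comb


def hypergeom_pvalue(overlap: int, query_size: int, set_size: int, background: int) -> float:
    """Probability of seeing at least `overlap` shared genes by chance.

    Models drawing `query_size` genes without replacement from a background
    of `background` genes, of which `set_size` belong to the library set.
    """
    max_overlap = min(query_size, set_size)
    total = comb(background, query_size)
    tail = sum(
        comb(set_size, k) * comb(background - set_size, query_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


def jaccard(a, b) -> float:
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_gene_sets(query, library, background=20000):
    """Return (name, overlap, jaccard, p_value) tuples, most significant first.

    `background` is an assumed universe size, not a value from the paper.
    """
    query = set(query)
    results = []
    for name, genes in library.items():
        genes = set(genes)
        overlap = len(query & genes)
        p = hypergeom_pvalue(overlap, len(query), len(genes), background)
        results.append((name, overlap, jaccard(query, genes), p))
    return sorted(results, key=lambda r: r[3])


if __name__ == "__main__":
    # Hypothetical toy library; real entries would come from supplementary tables.
    toy_library = {
        "example_table_A": ["TP53", "MDM2", "CDKN1A", "BAX"],
        "example_table_B": ["GAPDH", "ACTB"],
    }
    for row in rank_gene_sets(["TP53", "MDM2", "CDKN1A"], toy_library):
        print(row)
```

In practice, p-values computed against hundreds of thousands of gene sets would also need a multiple-testing correction, which this sketch omits.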
