
Pathway information extracted from 25 years of pathway figures.

Affiliations

Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA.

Maastricht Centre for Systems Biology (MaCSBio), Maastricht University, Maastricht, The Netherlands.

Publication information

Genome Biol. 2020 Nov 9;21(1):273. doi: 10.1186/s13059-020-02181-2.

DOI:10.1186/s13059-020-02181-2
PMID:33168034
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7649569/
Abstract

Thousands of pathway diagrams are published each year as static figures inaccessible to computational queries and analyses. Using a combination of machine learning, optical character recognition, and manual curation, we identified 64,643 pathway figures published between 1995 and 2019 and extracted 1,112,551 instances of human genes, comprising 13,464 unique NCBI genes, participating in a wide variety of biological processes. This collection represents an order of magnitude more genes than found in the text of the same papers, and thousands of genes missing from other pathway databases, thus presenting new opportunities for discovery and research.
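The abstract distinguishes gene instances (every occurrence found in a figure) from unique NCBI genes. The sketch below illustrates that distinction on a toy example and is not the authors' pipeline. It assumes OCR has already produced plain text from a figure, and it matches tokens against a tiny, hypothetical lexicon of symbols and aliases. The names LEXICON, normalize, and extract_genes are invented for illustration.

```python
import re

# Hypothetical mini-lexicon mapping gene symbols and aliases to NCBI Gene IDs.
# An illustrative subset only; a real lexicon would cover all human genes.
LEXICON = {
    "TP53": 7157,
    "P53": 7157,   # alias of TP53
    "EGFR": 1956,
    "AKT1": 207,
    "MTOR": 2475,
}


def normalize(token: str) -> str:
    """Uppercase a token and drop every character that is not A-Z or 0-9."""
    return re.sub(r"[^A-Z0-9]", "", token.upper())


def extract_genes(ocr_text: str) -> list[int]:
    """Return the NCBI Gene ID of each lexicon match, in order of appearance."""
    hits = []
    for token in re.split(r"[\s,;/]+", ocr_text):
        key = normalize(token)
        if key in LEXICON:
            hits.append(LEXICON[key])
    return hits


# Assumed OCR output from a single pathway figure (made-up text).
ocr_text = "EGFR -> Akt1 -> mTOR; p53 (nucleus); TP53"
ids = extract_genes(ocr_text)
print("instances:", len(ids))
print("unique genes:", len(set(ids)))
```

Here "p53" and "TP53" count as two instances but resolve to a single unique gene, which is how a corpus can hold far more instances than unique genes.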


Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a948/7650200/22b5ba30471d/13059_2020_2181_Fig1_HTML.jpg
Fig. 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a948/7650200/347039c65367/13059_2020_2181_Fig2_HTML.jpg
Fig. 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a948/7650200/ef1d88653904/13059_2020_2181_Fig3_HTML.jpg
Fig. 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a948/7650200/a3c104d31a59/13059_2020_2181_Fig4_HTML.jpg
Fig. 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a948/7650200/b54e1f18ab97/13059_2020_2181_Fig5_HTML.jpg
