Lit-OTAR framework for extracting biological evidences from literature.

Author information

Tirunagari Santosh, Saha Shyamasree, Venkatesan Aravind, Suveges Daniel, Carmona Miguel, Buniello Annalisa, Ochoa David, McEntyre Johanna, McDonagh Ellen, Harrison Melissa

Affiliations

Literature Services Team, European Bioinformatics Institute, European Molecular Biology Laboratory (EMBL-EBI), Wellcome Trust Genome Campus, Cambridge CB10 1SD, United Kingdom.

Open Targets, European Bioinformatics Institute, European Molecular Biology Laboratory (EMBL-EBI), Wellcome Trust Genome Campus, Cambridge CB10 1SD, United Kingdom.

Publication information

Bioinformatics. 2025 Mar 29;41(4). doi: 10.1093/bioinformatics/btaf113.

Abstract

SUMMARY

The lit-OTAR framework, developed through a collaboration between Europe PMC and Open Targets, uses deep learning to extract evidence from the scientific literature for drug target identification and validation. It combines named entity recognition, which identifies gene/protein (target), disease, organism, and chemical/drug mentions in scientific text, with entity normalization, which maps these entities to databases such as Ensembl, Experimental Factor Ontology, and ChEMBL. The framework runs continuously and has processed over 39 million abstracts and 4.5 million full-text articles and preprints to date, identifying more than 48.5 million unique associations that can help accelerate drug discovery and scientific research, including over 29.9 million distinct target-disease, 11.8 million distinct target-drug, and 8.3 million distinct disease-drug relationships.
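
To make the flow from entity recognition to normalization to pairwise associations concrete, here is a minimal sketch. It is not the lit-OTAR implementation: the framework uses deep learning models, whereas this toy substitutes a dictionary lookup, and it assumes that an association is counted when two entities of different types appear in the same sentence. The lexicon, the placeholder identifiers, and the function names are all hypothetical.

```python
import re
from collections import Counter
from itertools import product

# Hypothetical toy lexicon: surface form -> (entity type, identifier).
# The identifiers are placeholders, not real Ensembl/EFO/ChEMBL records.
LEXICON = {
    "brca1": ("target", "TARGET:0001"),
    "breast cancer": ("disease", "DISEASE:0001"),
    "olaparib": ("drug", "DRUG:0001"),
}

# The three relationship types named in the abstract.
PAIR_TYPES = [("target", "disease"), ("target", "drug"), ("disease", "drug")]


def tag_entities(sentence):
    """Dictionary-based stand-in for NER plus normalization.

    Returns the set of (entity type, identifier) pairs found in the sentence.
    """
    lowered = sentence.lower()
    found = set()
    for surface, (etype, ident) in LEXICON.items():
        if re.search(r"\b" + re.escape(surface) + r"\b", lowered):
            found.add((etype, ident))
    return found


def extract_associations(text):
    """Count sentence-level co-occurrences for each relationship type.

    Assumption: co-occurrence within one sentence counts as an association.
    """
    counts = Counter()
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        entities = tag_entities(sentence)
        for left_type, right_type in PAIR_TYPES:
            lefts = sorted(i for t, i in entities if t == left_type)
            rights = sorted(i for t, i in entities if t == right_type)
            for pair in product(lefts, rights):
                counts[pair] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "BRCA1 mutations raise breast cancer risk. "
        "Olaparib is used in BRCA1 carriers."
    )
    for pair, n in extract_associations(sample).items():
        print(pair, n)
```

Because normalization maps every surface form to one identifier, repeated mentions of the same pair across documents collapse into a single distinct association with a count, which is the sense in which the figures above are reported as distinct relationships.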

AVAILABILITY AND IMPLEMENTATION

The results are accessible through Europe PMC's SciLite web app (https://europepmc.org/) and its annotations API (https://europepmc.org/annotationsapi), as well as via the Open Targets Platform (https://platform.opentargets.org/). The daily pipeline is available at https://github.com/ML4LitS/otar-maintenance, and the Open Targets ETL processes are available at https://github.com/opentargets.
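
As a hedged illustration of programmatic access, the sketch below only builds a request URL for the annotations API and does not send it. The base URL, the `annotationsByArticleIds` endpoint, and the `articleIds`, `type`, and `format` parameter names are assumptions based on the public API as commonly documented; check them against the documentation linked above before use. The article identifier and the helper name `build_annotations_url` are hypothetical.

```python
from urllib.parse import urlencode

# Assumed base URL and endpoint; confirm against the annotations API docs.
BASE_URL = "https://www.ebi.ac.uk/europepmc/annotations_api"


def build_annotations_url(article_ids, annotation_type=None, fmt="JSON"):
    """Build a query URL for annotations on the given articles.

    article_ids: identifiers in an assumed "SOURCE:ID" form, e.g. "MED:12345".
    annotation_type: optional filter value (assumed parameter name "type").
    """
    params = {"articleIds": ",".join(article_ids), "format": fmt}
    if annotation_type:
        params["type"] = annotation_type
    return f"{BASE_URL}/annotationsByArticleIds?{urlencode(params)}"


if __name__ == "__main__":
    # "MED:12345" is a made-up identifier used only for illustration.
    print(build_annotations_url(["MED:12345"], annotation_type="Gene_Proteins"))
```

Keeping URL construction separate from the network call makes the request easy to inspect and adjust if the parameter names differ from the assumptions here.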
