TarKG: a comprehensive biomedical knowledge graph for target discovery.

Affiliations

Key Laboratory of Drug-Targeting and Drug Delivery System of the Education Ministry and Sichuan Province, Department of Medicinal Chemistry, West China School of Pharmacy, Sichuan University, Chengdu 610041, China.

Division of Data Intelligence, Department of Computer Science, Shantou University, Shantou 515063, China.

Publication information

Bioinformatics. 2024 Oct 1;40(10). doi: 10.1093/bioinformatics/btae598.

Abstract

MOTIVATION

Target discovery is a crucial step in drug development, as it directly affects the success rate of clinical trials. Knowledge graphs (KGs) offer unique advantages in processing complex biological data and inferring new relationships. Existing biomedical KGs primarily focus on tasks such as drug repositioning and drug-target interactions, leaving a gap in the construction of KGs tailored for target discovery.

RESULTS

We established a comprehensive biomedical KG focused on target discovery, termed TarKG, by integrating seven existing biomedical KGs, nine public databases, and traditional Chinese medicine knowledge databases. TarKG consists of 1 143 313 entities and 32 806 467 relations across 15 entity categories and 171 relation types, all centered around three core entity types: Disease, Gene, and Compound. TarKG also provides specialized knowledge for the core entities, including chemical structures, protein sequences, or text descriptions. Using different KG embedding algorithms, we assessed the knowledge completion capabilities of TarKG, particularly for disease-target link prediction. In case studies, we further examined TarKG's ability to predict potential protein targets for Alzheimer's disease (AD) and to identify diseases potentially associated with the metallo-deubiquitinase CSN5, using literature analysis for validation. Furthermore, we provide a user-friendly web server (https://tarkg.ddtmlab.org) that enables users to perform knowledge retrieval and relation inference using TarKG.
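To illustrate what disease-target link prediction with a KG embedding involves, here is a minimal sketch using a TransE-style score. The abstract does not name the embedding algorithms that were evaluated, so TransE is an assumption chosen only because it is a widely known example. The entity names, the relation name, and the 2-D vectors are all hypothetical toy values, not data from TarKG.

```python
import math

# Hypothetical toy embeddings (not taken from TarKG); 2-D for readability.
entity_vec = {
    "Disease:AD": (0.0, 0.0),
    "Gene:A": (1.0, 0.0),
    "Gene:B": (0.0, 3.0),
}
# Hypothetical relation name and vector, for illustration only.
relation_vec = {"associated_with": (1.0, 0.0)}


def transe_score(head, relation, tail):
    """TransE-style plausibility: negative L2 distance between head + relation and tail."""
    h = entity_vec[head]
    r = relation_vec[relation]
    t = entity_vec[tail]
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_tails(head, relation, candidates):
    """Order candidate tail entities from most to least plausible."""
    return sorted(
        candidates,
        key=lambda tail: transe_score(head, relation, tail),
        reverse=True,
    )


# Rank candidate genes for the query (Disease:AD, associated_with, ?).
ranked = rank_tails("Disease:AD", "associated_with", ["Gene:A", "Gene:B"])
print(ranked)
```

In a real setting the vectors would be learned from the graph's triples rather than written by hand, and candidates would be ranked over all Gene entities; the sketch only shows the scoring and ranking step.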

AVAILABILITY AND IMPLEMENTATION

TarKG is accessible at https://tarkg.ddtmlab.org.
