

RDKG-115: Assisting drug repurposing and discovery for rare diseases by trimodal knowledge graph embedding.

Affiliations

Intelligent Medicine Institute, Shanghai Medical College, Fudan University, Shanghai, 200032, China.

College of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, China.

Publication information

Comput Biol Med. 2023 Sep;164:107262. doi: 10.1016/j.compbiomed.2023.107262. Epub 2023 Jul 17.

Abstract

Rare diseases (RDs) each affect only a small number of individuals, but together they have a significant impact on a global scale. Accurate diagnosis of RDs is challenging, and there is a severe shortage of drugs available for treatment. Because RD drug development involves high investment, high risk, and long development cycles, pharmaceutical companies have shown a preference for repurposing existing drugs developed for other diseases. Compared with traditional approaches, knowledge graph embedding (KGE) based methods are more efficient and convenient because they treat drug repurposing as a link prediction task. KGE models also allow existing knowledge to be enriched by incorporating multimodal information from various sources. In this study, we constructed RDKG-115, a rare disease knowledge graph covering 115 RDs and composed of 35,643 entities, 25 relations, and 5,539,839 refined triplets, based on 372,384 high-quality publications and 4 biomedical datasets: DRKG, Pathway Commons, PharmKG, and PMapp. We then developed a trimodal KGE model that combines structure, category, and description embeddings using reverse-hyperplane projection. We used this model to infer 4,199 reliable new triplets from RDKG-115. Finally, we identified potential drugs and small molecules for each of the 115 RDs, taking multiple sclerosis as a case study. This study provides a paradigm for large-scale screening in drug repurposing and discovery for RDs, which should speed up the drug development process and ultimately benefit patients with RDs. The source code and data are available at https://github.com/ZhuChaoY/RDKG-115.
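
To make the link prediction framing concrete, the following is a minimal sketch of how a hyperplane-projection KGE model can rank candidate drugs for a disease. It is not the authors' trimodal model: the abstract does not specify how reverse-hyperplane projection works, so the sketch substitutes the standard TransH-style projection, in which head and tail embeddings are projected onto a relation-specific hyperplane before a translation distance is computed. All function names, vector values, and the notion of a "treats" relation are hypothetical illustrations.

```python
import numpy as np


def project_to_hyperplane(e, w):
    """Project embedding(s) e onto the hyperplane whose normal is w.

    e may be a single vector (d,) or a batch (n, d); the result is (n, d).
    """
    w = w / np.linalg.norm(w)          # make the normal a unit vector
    e = np.atleast_2d(e)
    return e - (e @ w)[:, None] * w    # subtract the component along w


def transh_score(h, d_r, w_r, t):
    """TransH-style distance ||h_perp + d_r - t_perp||; lower is more plausible."""
    h_p = project_to_hyperplane(h, w_r)
    t_p = project_to_hyperplane(t, w_r)
    return np.linalg.norm(h_p + d_r - t_p, axis=1)


def rank_heads(candidates, d_r, w_r, t):
    """Rank candidate head entities (e.g. drugs) for a fixed relation and tail.

    Returns candidate indices from most to least plausible, with their scores.
    """
    scores = transh_score(candidates, d_r, w_r, t)
    order = np.argsort(scores, kind="stable")
    return order, scores[order]


# Toy, hand-picked 3-dimensional embeddings (assumptions, not trained values).
w_treats = np.array([0.0, 0.0, 1.0])   # hyperplane normal for a "treats" relation
d_treats = np.array([1.0, 0.0, 0.0])   # translation vector on that hyperplane
disease = np.array([1.0, 1.0, 5.0])    # tail entity embedding
drugs = np.array([
    [0.0, 1.0, 9.0],
    [5.0, 5.0, 0.0],
    [0.0, 1.0, -3.0],
])

order, sorted_scores = rank_heads(drugs, d_treats, w_treats, disease)
print(order, sorted_scores)
```

In a setting like the one described, the embeddings would be learned from the knowledge graph's triplets rather than written by hand, and the top-ranked (drug, relation, disease) combinations that are absent from the graph would be the candidate new triplets.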

