BioKGrapher: Initial evaluation of automated knowledge graph construction from biomedical literature.

Authors

Schäfer Henning, Idrissi-Yaghir Ahmad, Arzideh Kamyar, Damm Hendrik, Pakull Tabea M G, Schmidt Cynthia S, Bahn Mikel, Lodde Georg, Livingstone Elisabeth, Schadendorf Dirk, Nensa Felix, Horn Peter A, Friedrich Christoph M

Affiliations

Institute for Transfusion Medicine, University Hospital Essen, Hufelandstraße 55, Essen, 45147, Germany.

Department of Computer Science, University of Applied Sciences and Arts Dortmund (FHDO), Emil-Figge Str. 42, Dortmund, 44227, Germany.

Publication

Comput Struct Biotechnol J. 2024 Oct 17;24:639-660. doi: 10.1016/j.csbj.2024.10.017. eCollection 2024 Dec.

Abstract

The growth of biomedical literature presents challenges in extracting and structuring knowledge. Knowledge Graphs (KGs) offer a solution by representing relationships between biomedical entities. However, manual construction of KGs is labor-intensive and time-consuming, highlighting the need for automated methods. This work introduces BioKGrapher, a tool for automatic KG construction using large-scale publication data, with a focus on biomedical concepts related to specific medical conditions. BioKGrapher allows researchers to construct KGs from PubMed IDs. The BioKGrapher pipeline begins with Named Entity Recognition and Linking (NER+NEL) to extract and normalize biomedical concepts from PubMed, mapping them to the Unified Medical Language System (UMLS). Extracted concepts are weighted and re-ranked using Kullback-Leibler divergence and local frequency balancing. These concepts are then integrated into hierarchical KGs, with relationships formed using terminologies like SNOMED CT and NCIt. Downstream applications include multi-label document classification using Adapter-infused Transformer models. BioKGrapher effectively aligns generated concepts with clinical practice guidelines from the German Guideline Program in Oncology (GGPO), achieving F1-Scores of up to 0.6. In multi-label classification, Adapter-infused models using a BioKGrapher cancer-specific KG improved micro F1-Scores by up to 0.89 percentage points over a non-specific KG and 2.16 points over base models across three BERT variants. The drug-disease extraction case study identified indications for Nivolumab and Rituximab. BioKGrapher is a tool for automatic KG construction, aligning with the GGPO and enhancing downstream task performance. It offers a scalable solution for managing biomedical knowledge, with potential applications in literature recommendation, decision support, and drug repurposing.
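
The concept-weighting step can be illustrated with a minimal sketch. The abstract states only that Kullback-Leibler divergence and local frequency balancing are used, so everything below is an assumption rather than the BioKGrapher implementation: the function name `kl_concept_weights`, the add-one smoothing, the use of a background corpus as the reference distribution, and the toy counts are all hypothetical. The sketch scores each concept by its pointwise contribution to KL(domain || background), and it omits local frequency balancing because the abstract does not define it.

```python
import math
from collections import Counter


def kl_concept_weights(domain_counts, background_counts, smoothing=1.0):
    """Score concepts by their pointwise contribution to KL(domain || background).

    Hypothetical helper for illustration only; not taken from BioKGrapher.
    Additive smoothing keeps both probabilities strictly positive.
    """
    vocab = set(domain_counts) | set(background_counts)
    domain_total = sum(domain_counts.values()) + smoothing * len(vocab)
    background_total = sum(background_counts.values()) + smoothing * len(vocab)

    weights = {}
    for concept in vocab:
        p = (domain_counts.get(concept, 0) + smoothing) / domain_total
        q = (background_counts.get(concept, 0) + smoothing) / background_total
        # Positive when the concept is over-represented in the domain corpus.
        weights[concept] = p * math.log(p / q)
    return weights


# Toy, made-up concept counts: a condition-specific corpus vs. a general one.
domain = Counter({"melanoma": 40, "neoplasm": 30, "patient": 30})
background = Counter({"melanoma": 5, "neoplasm": 35, "patient": 60})

weights = kl_concept_weights(domain, background)
ranked = sorted(weights, key=weights.get, reverse=True)
print(ranked)
```

In this toy setup, a concept that is frequent in the condition-specific corpus but rare in the background receives the largest weight, while generic concepts that are equally or more common in the background receive negative weights and drop down the ranking.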

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/58ae/11536026/09394f26d3de/gr001.jpg
