Biomedical relation extraction method based on ensemble learning and attention mechanism.

Author information

Department of Radiation Oncology, Shanghai East Hospital, Tongji University School of Medicine, Shanghai, China.

Department of Computer College, Beijing Information Science and Technology University, Beijing, China.

Publication information

BMC Bioinformatics. 2024 Oct 18;25(1):333. doi: 10.1186/s12859-024-05951-y.

Abstract

BACKGROUND

Relation extraction (RE) is essential to biomedical research because it uncovers complex semantic relationships between entities in text. As the biomedical literature grows, there is an urgent need for computational models that can extract these relationships accurately and efficiently at scale.

RESULTS

This paper proposes SARE, a novel approach that combines Stacking ensemble learning with attention mechanisms to improve biomedical relation extraction. By leveraging multiple pre-trained models, SARE shows improved adaptability and robustness across diverse domains, and the attention mechanisms help the model capture and use key information in the text more accurately. Compared with the original BERT variant and the domain-specific PubMedBERT model, SARE improved performance by 4.8, 8.7, and 0.8 percentage points on the PPI, DDI, and ChemProt datasets, respectively.
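The general idea of fusing several base models with attention can be illustrated with a minimal sketch. This is not SARE's architecture, which the abstract does not specify; it is a simplified attention-weighted fusion of base-model class probabilities, where the `query` vector stands in for parameters that a stacking meta-learner would normally fit on held-out base-model predictions. The function names, the toy probabilities, and the `query` values are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_stack(base_probs, query):
    """Fuse base-model predictions with dot-product attention.

    base_probs: one class-probability vector per base model (equal lengths).
    query: hypothetical vector of the same length; in a real stacking setup
           it would be learned, here it is supplied by hand.
    Returns (fused_probs, attention_weights).
    """
    # Score each base model by the dot product of its output with the query.
    scores = [sum(p * q for p, q in zip(probs, query)) for probs in base_probs]
    weights = softmax(scores)
    n_classes = len(base_probs[0])
    # The fused prediction is the attention-weighted average of the outputs.
    fused = [
        sum(w * probs[c] for w, probs in zip(weights, base_probs))
        for c in range(n_classes)
    ]
    return fused, weights


# Toy outputs from three hypothetical base models over three relation classes.
base_probs = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
]
query = [1.0, 0.0, 0.0]  # illustrative values, not learned

fused, weights = attention_stack(base_probs, query)
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

Because the attention weights sum to one and each base output is a probability vector, the fused vector is also a valid probability distribution. In a trained system the weights would depend on the input sentence rather than on a fixed query.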

CONCLUSIONS

SARE offers a promising way to improve the accuracy and efficiency of relation extraction in biomedical research, supporting advances in biomedical informatics. The results suggest that combining ensemble learning with attention mechanisms is effective for extracting complex relationships from biomedical texts. Our code and data are publicly available at https://github.com/GS233/Biomedical.
