Relation Extraction in Biomedical Texts Based on Multi-Head Attention Model With Syntactic Dependency Feature: Modeling Study.

Authors

Li Yongbin, Hui Linhu, Zou Liping, Li Huyang, Xu Luo, Wang Xiaohua, Chua Stephanie

Affiliations

School of Medical Information Engineering, Zunyi Medical University, Zunyi, China.

Faculty of Computer Science and Information Technology, University Malaysia Sarawak, Sarawak, Malaysia.

Publication information

JMIR Med Inform. 2022 Oct 20;10(10):e41136. doi: 10.2196/41136.

DOI: 10.2196/41136

PMID: 36264604

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9634522/

Abstract

BACKGROUND

With the rapid expansion of biomedical literature, biomedical information extraction has attracted increasing attention from researchers. In particular, relation extraction between 2 entities is a long-standing research topic.

OBJECTIVE

This study aimed to address 2 multiclass relation extraction tasks from the Biomedical Natural Language Processing Workshop 2019 Open Shared Tasks: the Bacteria-Biotope relation extraction (BB-rel) task and the binary relation extraction of plant seed development (SeeDev-binary) task. In essence, both tasks aim to extract the relations between annotated entity pairs from biomedical texts, which is a challenging problem.

METHODS

Earlier work relied on feature- or kernel-based methods and achieved good performance. For these tasks, we propose a deep learning model based on a combination of several distributed features, such as domain-specific word embedding, part-of-speech embedding, entity-type embedding, distance embedding, and position embedding. A multi-head attention mechanism is used to extract the global semantic features of the entire sentence. In addition, we introduce a dependency-type feature and the shortest dependency path connecting the 2 candidate entities in the syntactic dependency graph to enrich the feature representation.
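The shortest dependency path between 2 candidate entities can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes the dependency parse is available as `(head, dependent, dependency_type)` triples (a hypothetical format), treats the graph as undirected, and uses breadth-first search. The function name `shortest_dependency_path` and the hand-written toy parse are illustrative only and do not come from the paper or from a real parser.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Return (tokens, dependency_types) along a shortest path, or None.

    edges: iterable of (head, dependent, dep_type) triples (assumed format).
    The graph is treated as undirected, a common simplification.
    """
    # Build an undirected adjacency map that remembers each edge's type.
    graph = {}
    for head, dep, dep_type in edges:
        graph.setdefault(head, []).append((dep, dep_type))
        graph.setdefault(dep, []).append((head, dep_type))

    # Breadth-first search finds a shortest path in an unweighted graph.
    queue = deque([source])
    parent = {source: None}
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for neighbor, dep_type in graph.get(node, []):
            if neighbor not in parent:
                parent[neighbor] = (node, dep_type)
                queue.append(neighbor)
    if target not in parent:
        return None  # the two tokens are not connected

    # Walk back from the target to the source.
    tokens, dep_types = [target], []
    node = target
    while parent[node] is not None:
        node, dep_type = parent[node]
        tokens.append(node)
        dep_types.append(dep_type)
    return tokens[::-1], dep_types[::-1]


# Hand-written toy parse of "Listeria was isolated from cheese" (illustrative).
toy_edges = [
    ("isolated", "Listeria", "nsubjpass"),
    ("isolated", "was", "auxpass"),
    ("isolated", "cheese", "nmod"),
    ("cheese", "from", "case"),
]
sdp = shortest_dependency_path(toy_edges, "Listeria", "cheese")
print(sdp)
```

In this toy example the path runs from `Listeria` through the verb `isolated` to `cheese`, and the dependency types along the path are the kind of information a dependency-type feature could encode.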

RESULTS

Experiments show that our proposed model performs well in biomedical relation extraction, achieving F scores of 65.56% and 38.04% on the test sets of the BB-rel and SeeDev-binary tasks, respectively. In the SeeDev-binary task in particular, the F score of our model exceeds that of other existing models and achieves state-of-the-art performance.
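For readers unfamiliar with the metric, the sketch below shows how an F score is commonly computed for relation extraction: as the harmonic mean of precision and recall over sets of predicted and gold relations. It is an assumption that the shared tasks score predictions this way; the official evaluation may differ in detail. The function name and the toy relation triples are hypothetical and are not data from the paper.

```python
def precision_recall_f1(gold, predicted):
    """Compute precision, recall, and F1 over sets of relation tuples."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up (relation_type, entity_1, entity_2) triples for illustration only.
toy_gold = [
    ("Lives_In", "Listeria", "cheese"),
    ("Lives_In", "Listeria", "milk"),
    ("Lives_In", "Salmonella", "poultry"),
    ("Exhibits", "Listeria", "cold tolerance"),
]
toy_predicted = [
    ("Lives_In", "Listeria", "cheese"),
    ("Lives_In", "Salmonella", "poultry"),
    ("Lives_In", "Salmonella", "cheese"),  # false positive
]
p, r, f1 = precision_recall_f1(toy_gold, toy_predicted)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

With 2 correct predictions out of 3 predicted and 4 gold relations, precision is 2/3, recall is 1/2, and the F score is 4/7 (about 0.571).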

CONCLUSIONS

We demonstrated that the multi-head attention mechanism can learn relevant syntactic and semantic features in different representation subspaces and at different positions to extract a comprehensive feature representation. Moreover, syntactic dependency features can improve the performance of the model by learning the dependency relations between the entities in biomedical texts.
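The idea of attending in different representation subspaces can be sketched as follows. This is a minimal pure-Python illustration of standard scaled dot-product multi-head self-attention, assuming the model follows that general formulation; it is not the authors' architecture. The weights are random and untrained, and all function names and dimensions are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def multi_head_self_attention(x, num_heads, rng):
    """x: seq_len x d_model token vectors. Returns (output, per-head weights)."""
    d_model = len(x[0])
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_head = d_model // num_heads
    head_outputs, head_weights = [], []
    for _ in range(num_heads):
        # Separate projections give each head its own representation subspace.
        w_q, w_k, w_v = (random_matrix(d_model, d_head, rng) for _ in range(3))
        q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
        # Scaled dot-product scores between every pair of positions.
        k_t = [list(col) for col in zip(*k)]
        scores = matmul(q, k_t)
        weights = [softmax([s / math.sqrt(d_head) for s in row]) for row in scores]
        head_outputs.append(matmul(weights, v))
        head_weights.append(weights)
    # Concatenate the heads per token, then apply an output projection.
    concat = [sum((head[i] for head in head_outputs), []) for i in range(len(x))]
    w_o = random_matrix(d_model, d_model, rng)
    return matmul(concat, w_o), head_weights


rng = random.Random(0)
toy_tokens = random_matrix(5, 8, rng)  # 5 tokens with 8-dimensional toy features
output, attn = multi_head_self_attention(toy_tokens, num_heads=2, rng=rng)
print(len(output), len(output[0]), len(attn))
```

Each head produces its own attention distribution over the 5 positions, so different heads can weight different tokens for the same query position before their outputs are concatenated.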

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a32f/9634522/93657783d200/medinform_v10i10e41136_fig1.jpg
