College of Medical Technology, Beijing Institute of Technology, Beijing, 100081, China.
College of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China.
Comput Biol Med. 2024 Jun;176:108539. doi: 10.1016/j.compbiomed.2024.108539. Epub 2024 Apr 29.
Nested entity recognition and relation extraction are two key tasks in the analysis of electronic medical records. However, most existing medical information extraction models treat these tasks separately, which leads to inconsistency between their outputs. In this paper, we propose a joint medical entity-relation extraction model with progressive recognition and targeted assignment (PRTA). In the joint decoding stage, entities and relations share the information of the sequence and word embedding layers. The two tasks are trained simultaneously and exchange information by updating the shared parameters. Specifically, we design a compound triangle strategy for nested entity recognition and an adaptive multi-space interactive strategy for relation extraction. We then construct a parameter-shared information space based on semantic continuity to decode entities and relations. Extensive experiments were conducted on the Private Liver Disease Dataset (PLDD), provided by Beijing Friendship Hospital of Capital Medical University, and on public datasets (NYT, ACE04 and ACE05). The results show that our method outperforms existing state-of-the-art (SOTA) methods on most metrics and effectively handles nested entities and overlapping relations.
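To make the terms "nested entities" and "overlapping relations" concrete, the following is a minimal illustrative sketch in Python. It is not the PRTA model: the sentence, the entity labels, the relation names and all function names are hypothetical and chosen only for illustration. It shows that candidate spans with start <= end fill the upper triangle of a token-by-token table, which is one common way to represent nested entities, and it shows how two relation triples can overlap by sharing an entity.

```python
from itertools import combinations

def candidate_spans(n):
    """All (start, end) token spans with start <= end.

    These are the cells of the upper triangle (diagonal included)
    of an n x n table, so there are n * (n + 1) / 2 of them.
    """
    return [(i, j) for i in range(n) for j in range(i, n)]

def is_nested(inner, outer):
    """True if span `inner` lies inside a different span `outer`."""
    return inner != outer and outer[0] <= inner[0] and inner[1] <= outer[1]

def nested_pairs(entities):
    """Return (inner, outer) pairs among the annotated entity spans."""
    return [(a, b) for a in entities for b in entities if is_nested(a, b)]

def overlapping_triples(triples):
    """Return pairs of (head, relation, tail) triples sharing an entity span."""
    return [
        (a, b)
        for a, b in combinations(triples, 2)
        if {a[0], a[2]} & {b[0], b[2]}
    ]

# Hypothetical example sentence and annotations (not taken from PLDD).
tokens = ["chronic", "hepatitis", "B", "virus", "infection", "damages", "liver"]
entities = {
    (0, 4): "Disease",   # "chronic hepatitis B virus infection"
    (1, 3): "Pathogen",  # "hepatitis B virus", nested inside the disease span
    (6, 6): "Organ",     # "liver"
}
triples = [
    ((0, 4), "caused_by", (1, 3)),
    ((0, 4), "affects", (6, 6)),  # shares the head entity with the first triple
]

spans = candidate_spans(len(tokens))
nested = nested_pairs(entities)
overlaps = overlapping_triples(triples)
```

In this toy sentence the pathogen span is nested inside the disease span, and the two triples overlap because both use the disease span as their head. A model that decodes entities and relations separately has to keep such cases consistent across two independent outputs, which is the motivation the abstract gives for joint extraction.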