School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, China.
Pengcheng Laboratory, Shenzhen, China.
Math Biosci Eng. 2022 Jul 27;19(10):10656-10672. doi: 10.3934/mbe.2022498.
Extracting relational triples from unstructured medical texts provides a basis for constructing large-scale medical knowledge graphs. The cascade binary pointer tagging network (CBPTN) performs well in joint entity and relation extraction, so we explore its effectiveness for joint entity and relation extraction from Chinese medical texts. In this paper, we propose two models based on the CBPTN: CBPTN with conditional layer normalization (Cas-CLN) and biaffine transformation-based CBPTN with multi-head selection (BTCAMS). Cas-CLN uses the CBPTN to decode the head entity first and then the relation and tail entity, and uses conditional layer normalization to strengthen the connection between the two steps. BTCAMS detects all possible entities in a sentence using the CBPTN and then determines the relation between each entity pair through a biaffine transformation. We evaluate the two models on two Chinese medical datasets, CMeIE and CEMRDS, and the experimental results demonstrate the effectiveness of both. Compared with the baseline CasREL, the F1 values of Cas-CLN and BTCAMS improved by 1.01% and 2.13%, respectively, on the CMeIE test data, and by 1.99% and 0.68%, respectively, on the CEMRDS test data.
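The two mechanisms named in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so the code assumes the commonly used formulations: conditional layer normalization whose gain and bias are linear functions of a condition vector (here standing in for the head-entity representation), and a biaffine score made of a bilinear term, a linear term over the concatenated pair, and a bias. All function names, parameter names, and shapes are hypothetical and are not taken from the paper.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def conditional_layer_norm(x: Vector, cond: Vector,
                           w_gain: Matrix, w_bias: Matrix,
                           eps: float = 1e-12) -> Vector:
    """Layer-normalize x, with gain and bias conditioned on cond.

    Assumed form: gain = 1 + w_gain @ cond, bias = w_bias @ cond.
    With all-zero weights this reduces to plain layer normalization.
    """
    mean = sum(x) / len(x)
    var = sum((xi - mean) ** 2 for xi in x) / len(x)
    normed = [(xi - mean) / math.sqrt(var + eps) for xi in x]
    gain = [1.0 + g for g in matvec(w_gain, cond)]
    bias = matvec(w_bias, cond)
    return [g * n + b for g, n, b in zip(gain, normed, bias)]


def biaffine_score(head: Vector, tail: Vector,
                   u: Matrix, w: Vector, b: float) -> float:
    """Score one (head, tail) pair for a single relation type.

    Assumed form: head^T U tail + w . [head; tail] + b.
    A multi-relation model would keep one (u, w, b) set per relation.
    """
    bilinear = sum(h * ut for h, ut in zip(head, matvec(u, tail)))
    linear = sum(wi * xi for wi, xi in zip(w, head + tail))
    return bilinear + linear + b


# Usage: a 3-dimensional token vector conditioned on a 1-dimensional
# stand-in for the head entity, then a score for one entity pair.
token = [1.0, 2.0, 3.0]
head_entity = [1.0]
conditioned = conditional_layer_norm(
    token, head_entity,
    w_gain=[[1.0], [1.0], [1.0]],
    w_bias=[[0.5], [0.5], [0.5]],
)
score = biaffine_score(
    head=[1.0, 0.0], tail=[0.0, 1.0],
    u=[[0.0, 2.0], [0.0, 0.0]],
    w=[0.1, 0.0, 0.0, 0.2],
    b=0.5,
)
```

In this sketch the condition vector changes how every token is rescaled, which is one way the second decoding step can be made to depend on the first. The biaffine score is computed independently for each entity pair, which is what allows all pairs in a sentence to be scored.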