School of Artificial Intelligence, Sun Yat-sen University, Zhuhai, China.
School of Information Management, Sun Yat-sen University, Guangzhou, China.
Health Informatics J. 2024 Jul-Sep;30(3):14604582241274762. doi: 10.1177/14604582241274762.
The main challenges in entity relation extraction are currently overlapping relations and cascading errors. Both CasRel and TPLinker have proven competitive in addressing these issues. This study explores the application of these two models to entity relation extraction from Chinese medical text. We evaluate the models on the publicly available CMeIE dataset and further enhance them by incorporating pre-trained models tailored to the characteristics of the text. The experimental results show that TPLinker benefits more, and more consistently, from pre-trained models than CasRel does, and that it achieves better performance with advanced pre-trained models. Notably, the MacBERT + TPLinker combination is the best choice, surpassing the benchmark model by 12.45% and outperforming ERNIE-Health 3.0, the leading model in the CBLUE challenge, by 2.31%.