Cross-lingual Natural Language Processing on Limited Annotated Case/Radiology Reports in English and Japanese: Insights from the Real-MedNLP Workshop.

Author information

Yada Shuntaro, Nakamura Yuta, Wakamiya Shoko, Aramaki Eiji

Affiliations

Graduate School of Science and Technology, Nara Institute of Science and Technology, Nara, Japan.

22nd Century Medical and Research Center, The University of Tokyo Hospital, Tokyo, Japan.

Publication information

Methods Inf Med. 2024 Oct 29. doi: 10.1055/a-2405-2489.

Abstract

BACKGROUND

Textual datasets (corpora) are crucial for the application of natural language processing (NLP) models. However, corpus creation in the medical field is challenging, primarily because of privacy issues with raw clinical data such as health records. Thus, the existing clinical corpora are generally small and scarce. Medical NLP (MedNLP) methodologies perform well with limited data availability.

OBJECTIVES

We present the outcomes of the Real-MedNLP workshop, which was conducted using limited and parallel medical corpora. Real-MedNLP has three distinct characteristics: (1) Limited annotated documents: the training data comprise only a small set (∼100) of annotated case reports (CRs) and radiology reports (RRs). (2) Bilingually parallel: the constructed corpora are parallel in Japanese and English. (3) Practical tasks: the workshop addresses both fundamental tasks, such as named entity recognition (NER), and applied practical tasks.

METHODS

We propose three tasks: NER of ∼100 available documents (Task 1), NER based only on annotation guidelines for humans (Task 2), and clinical applications (Task 3) consisting of adverse drug effect (ADE) detection for CRs and identical case identification (CI) for RRs.

RESULTS

Nine teams participated in the workshop. The best systems achieved F1-scores of 0.65 for CRs and 0.89 for RRs in Task 1, whereas the top scores in Task 2 were 50 to 70% lower. In Task 3, ADE reports were detected with an F1-score of up to 0.64, and CI reached a binary accuracy of up to 0.96.

CONCLUSION

Most systems adopted medical-domain-specific pretrained language models combined with data augmentation methods. Despite the challenge of limited corpus size in Tasks 1 and 2, recent approaches are promising: partial match F1-scores reached ∼0.8-0.9. The Task 3 applications revealed that differences in the availability of external language resources affected performance in each language.
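The abstract reports NER results as F1-scores and distinguishes partial match scores from the stricter headline figures. The sketch below illustrates one common way such scores are computed over entity spans. It is an assumption for illustration only: the abstract does not describe the workshop's scoring procedure, and the span format, the `disease` and `drug` labels, and the overlap rule used here are hypothetical.

```python
from typing import List, Tuple

# Hypothetical span format: (start offset, end offset exclusive, entity label).
Span = Tuple[int, int, str]


def overlaps(a: Span, b: Span) -> bool:
    """Return True if two spans share a label and at least one character."""
    return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]


def span_f1(gold: List[Span], pred: List[Span], partial: bool = False) -> float:
    """Entity-level F1 under exact matching, or overlap-based partial matching."""
    if partial:
        # A span counts as correct if it overlaps any same-label span on the other side.
        hit_pred = sum(any(overlaps(p, g) for g in gold) for p in pred)
        hit_gold = sum(any(overlaps(g, p) for p in pred) for g in gold)
    else:
        # A span counts as correct only if offsets and label are identical.
        gold_set, pred_set = set(gold), set(pred)
        hit_pred = sum(p in gold_set for p in pred)
        hit_gold = sum(g in pred_set for g in gold)
    precision = hit_pred / len(pred) if pred else 0.0
    recall = hit_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example with made-up offsets and labels (not data from the workshop).
gold = [(0, 5, "disease"), (10, 20, "drug")]
pred = [(0, 5, "disease"), (12, 20, "drug"), (30, 35, "drug")]

exact_score = span_f1(gold, pred)                  # one exact hit out of three predictions
partial_score = span_f1(gold, pred, partial=True)  # the (12, 20) span now also counts
print(round(exact_score, 2), round(partial_score, 2))
```

In this toy case the partial match score is higher than the exact match score because the prediction with a shifted boundary is credited, which is the general reason partial match figures can exceed exact match ones.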
