Weakly supervised spatial relation extraction from radiology reports.

Authors

Datta Surabhi, Roberts Kirk

Affiliation

School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA.

Publication information

JAMIA Open. 2023 Apr 22;6(2):ooad027. doi: 10.1093/jamiaopen/ooad027. eCollection 2023 Jul.

Abstract

OBJECTIVE

Weak supervision holds significant promise to improve clinical natural language processing by leveraging domain resources and expertise instead of large manually annotated datasets alone. Here, our objective is to evaluate a weak supervision approach to extract spatial information from radiology reports.

MATERIALS AND METHODS

Our weak supervision approach is based on data programming that uses rules (or labeling functions) relying on domain-specific dictionaries and radiology language characteristics to generate weak labels. The labels correspond to different spatial relations that are critical to understanding radiology reports. These weak labels are then used to fine-tune a pretrained Bidirectional Encoder Representations from Transformers (BERT) model.
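As a rough illustration of the data programming idea described above, the following Python sketch applies a few dictionary-based labeling functions to a sentence and combines their votes. Everything in it is an assumption made for illustration: the dictionaries, the `FIGURE`/`GROUND` role names, the function names, and the simple majority vote are hypothetical and are not taken from the paper. Data programming frameworks often combine votes with a learned label model rather than a majority vote.

```python
import re
from collections import Counter

# Hypothetical dictionaries and labels, for illustration only.
ABSTAIN = None
SPATIAL_TRIGGERS = {"in", "within", "at", "near", "adjacent to", "along"}
ANATOMY_TERMS = {"right upper lobe", "left lower lobe", "liver", "frontal lobe"}
FINDING_TERMS = {"opacity", "nodule", "mass", "lesion", "effusion"}


def find_terms(sentence, terms):
    """Return sorted (start, end, term) tuples for whole-word dictionary matches."""
    lowered = sentence.lower()
    hits = []
    for term in terms:
        for m in re.finditer(r"\b" + re.escape(term) + r"\b", lowered):
            hits.append((m.start(), m.end(), term))
    return sorted(hits)


# Each labeling function inspects one (trigger, entity) candidate and either
# votes for a role or abstains.
def lf_finding_before_trigger(cand):
    _, ent_end, term = cand["entity"]
    trig_start, _, _ = cand["trigger"]
    if term in FINDING_TERMS and ent_end <= trig_start:
        return "FIGURE"
    return ABSTAIN


def lf_anatomy_after_trigger(cand):
    ent_start, _, term = cand["entity"]
    _, trig_end, _ = cand["trigger"]
    if term in ANATOMY_TERMS and ent_start >= trig_end:
        return "GROUND"
    return ABSTAIN


def lf_anatomy_close_to_trigger(cand):
    # Anatomy starting within 5 characters after the trigger, e.g. "in the liver".
    ent_start, _, term = cand["entity"]
    _, trig_end, _ = cand["trigger"]
    if term in ANATOMY_TERMS and 0 <= ent_start - trig_end <= 5:
        return "GROUND"
    return ABSTAIN


LABELING_FUNCTIONS = [
    lf_finding_before_trigger,
    lf_anatomy_after_trigger,
    lf_anatomy_close_to_trigger,
]


def weak_label(cand):
    """Combine non-abstaining votes by majority (a simplification)."""
    votes = [lf(cand) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    return Counter(votes).most_common(1)[0][0]


def label_sentence(sentence):
    """Return (entity, role, trigger) weak labels for one sentence."""
    triggers = find_terms(sentence, SPATIAL_TRIGGERS)
    entities = find_terms(sentence, ANATOMY_TERMS | FINDING_TERMS)
    labels = []
    for trig in triggers:
        for ent in entities:
            role = weak_label({"trigger": trig, "entity": ent})
            if role is not ABSTAIN:
                labels.append((ent[2], role, trig[2]))
    return labels


print(label_sentence("There is a small nodule in the right upper lobe."))
```

Weak labels produced this way could then serve as training targets for fine-tuning a pretrained model, which is the role they play in the approach described above.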

RESULTS

Our weakly supervised BERT model provided satisfactory results in extracting spatial relations without manual annotations for training (spatial trigger F1: 72.89, relation F1: 52.47). When this model is further fine-tuned on manual annotations (relation F1: 68.76), performance surpasses the fully supervised state-of-the-art.
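For readers unfamiliar with the metric, the sketch below shows one common way an F1 score can be computed over extracted relations, assuming exact matching of (entity, role, trigger) tuples. The matching criterion, the function name, and the toy tuples are assumptions for illustration and do not reproduce the paper's evaluation or its data.

```python
def precision_recall_f1(gold, predicted):
    """Compute precision, recall, and F1 over exactly matching items."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example with made-up tuples (not results from the paper).
gold = {
    ("nodule", "FIGURE", "in"),
    ("right upper lobe", "GROUND", "in"),
    ("effusion", "FIGURE", "at"),
}
predicted = {
    ("nodule", "FIGURE", "in"),
    ("right upper lobe", "GROUND", "in"),
    ("mass", "FIGURE", "in"),
}
p, r, f1 = precision_recall_f1(gold, predicted)
print(f"precision={p:.2f} recall={r:.2f} F1={100 * f1:.2f}")
```

The abstract reports F1 on a 0 to 100 scale, which is why the toy value is multiplied by 100 when printed.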

DISCUSSION

To our knowledge, this is the first work to automatically create detailed weak labels corresponding to radiological information of clinical significance. Our data programming approach is (1) adaptable as the labeling functions can be updated with relatively little manual effort to incorporate more variations in radiology language reporting formats and (2) generalizable as these functions can be applied across multiple radiology subdomains in most cases.

CONCLUSIONS

We demonstrate that a weakly supervised model performs sufficiently well in identifying a variety of relations from radiology text without manual annotations, and that it exceeds state-of-the-art results when annotated data are available.
