Reichenpfader Daniel, Knupp Jonas, von Däniken Sandro Urs, Gaio Roberto, Dennstädt Fabio, Cereghetti Grazia Maria, Sander André, Hiltbrunner Hans, Nairz Knud, Denecke Kerstin
Institute for Patient-Centered Digital Health, School of Engineering and Computer Science, Bern University of Applied Sciences, Biel/Bienne, Switzerland.
PhD School of Life Sciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland.
J Med Internet Res. 2025 Apr 25;27:e68427. doi: 10.2196/68427.
Structured reporting is essential for improving the clarity and accuracy of radiological information. Despite its benefits, the European Society of Radiology notes that it is not widely adopted. For example, while structured reporting frameworks such as the Breast Imaging Reporting and Data System provide standardized terminology and classification for mammography findings, radiology reports still mostly comprise free-text sections. This variability complicates the systematic extraction of key clinical data. Moreover, manual structuring of reports is time-consuming and prone to inconsistencies. Recent advancements in large language models have shown promise for clinical information extraction by enabling models to understand contextual nuances in medical text. However, challenges such as domain adaptation, privacy concerns, and generalizability remain. To address these limitations, frame semantics offers an approach to information extraction grounded in computational linguistics, allowing a structured representation of clinically relevant concepts.
This study explores combining the Bidirectional Encoder Representations from Transformers (BERT) architecture with the linguistic concept of frame semantics to extract and normalize information from free-text mammography reports.
After creating an annotated corpus of 210 German reports for fine-tuning, we generated several BERT model variants by applying 3 pretraining strategies to hospital data. We then built a fact extraction pipeline comprising an extractive question-answering model and a sequence labeling model. We quantitatively evaluated all model variants using common evaluation metrics (model perplexity, Stanford Question Answering Dataset 2.0 [SQuAD_v2], seqeval) and performed a qualitative clinician evaluation of the entire pipeline on a manually generated synthetic dataset of 21 reports. We also compared the pipeline with a generative approach that followed best practice prompting techniques using the open-source Llama 3.3 model (Meta).
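As an illustration of two of the metrics named above, the following is a minimal sketch, not the study's evaluation code. It assumes whitespace tokenization and lowercasing only (the official SQuAD script applies further normalization), and the function names `token_f1` and `perplexity` are hypothetical.

```python
import math
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Simplified SQuAD-style token-overlap F1 between two answer strings."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # SQuAD 2.0 convention: an unanswerable question has an empty answer,
    # so the score is 1 only when both strings are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def perplexity(token_log_probs: list[float]) -> float:
    """Perplexity as exp of the mean negative log-likelihood (natural log)."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))
```

Lower perplexity means the language model assigns higher probability to the held-out text, whereas token F1 rewards partial overlap between an extracted answer span and the annotated one.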
Our system can extract 14 fact types and 40 entities from the clinical findings section of mammography reports. Further pretraining on hospital data reduced model perplexity, although it did not significantly impact the 2 downstream tasks. With the best pretraining strategy, we achieved average F-scores of 90.4% for question answering and 81% for sequence labeling. Qualitative evaluation of the pipeline on synthetic data showed an overall precision of 96.1% for facts and 99.6% for entities. In contrast, generative extraction showed an overall precision of 91.2% for facts and 87.3% for entities. Hallucinations and extraction inconsistencies were observed.
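The sequence labeling F-score reported above is an entity-level measure. The sketch below shows how such a score can be computed from BIO tag sequences, in the spirit of seqeval's entity-level scoring; it is an assumption-laden illustration rather than the library's implementation, and the helper names and the labels `Size` and `Loc` are hypothetical.

```python
def bio_spans(tags):
    """Return the set of (label, start, end) spans in a BIO tag sequence.

    `end` is exclusive. An I- tag that does not continue the current span
    is treated leniently as the start of a new span.
    """
    spans = set()
    start, label = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last span
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != label
        )
        if label is not None and (tag == "O" or starts_new):
            spans.add((label, start, i))
            start, label = None, None
        if starts_new:
            start, label = i, tag[2:]
    return spans


def span_scores(gold_tags, pred_tags):
    """Entity-level precision, recall, and F1; a span counts only on exact match."""
    gold, pred = bio_spans(gold_tags), bio_spans(pred_tags)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: the prediction misses the second entity.
gold = ["B-Size", "I-Size", "O", "B-Loc"]
pred = ["B-Size", "I-Size", "O", "O"]
precision, recall, f1 = span_scores(gold, pred)
```

Because a span must match exactly in label and boundaries, a prediction that clips one token of an entity counts as both a false positive and a false negative, which makes this measure stricter than per-token accuracy.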
This study demonstrates that frame semantics provides a robust and interpretable framework for automating structured reporting. By leveraging frame semantics, the approach enables customizable information extraction and supports generalization to diverse radiological domains and clinical contexts with additional annotation efforts. Furthermore, the BERT-based model architecture allows for efficient, on-premise deployment, ensuring data privacy. Future research should focus on validating the model's generalizability across external datasets and different report types to ensure its broader applicability in clinical practice.