Natural language processing to predict isocitrate dehydrogenase genotype in diffuse glioma using MR radiology reports.

Author information

Department of Radiology and Research Institute of Radiological Science and Center for Clinical Imaging Data Science, Yonsei University College of Medicine, Seoul, Korea.

Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Korea.

Publication information

Eur Radiol. 2023 Nov;33(11):8017-8025. doi: 10.1007/s00330-023-10061-z. Epub 2023 Aug 11.

Abstract

OBJECTIVES

To evaluate the performance of natural language processing (NLP) models to predict isocitrate dehydrogenase (IDH) mutation status in diffuse glioma using routine MR radiology reports.

MATERIALS AND METHODS

This retrospective, multi-center study included consecutive patients with diffuse glioma and known IDH mutation status from May 2009 to November 2021 whose initial MR radiology report was available prior to pathologic diagnosis. Five NLP models (long short-term memory [LSTM], bidirectional LSTM, bidirectional encoder representations from transformers [BERT], BERT graph convolutional network [BERT GCN], BioBERT) were trained, and the area under the receiver operating characteristic curve (AUC) was used to assess prediction of IDH mutation status in the internal and external validation sets. The performance of the best-performing NLP model was compared with that of human readers.
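As an illustration of the evaluation metric named above, the following is a minimal sketch of how an AUC can be computed from predicted scores and binary labels. It uses the rank-based definition (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half). The function name, the label coding (1 = IDH-mutant, 0 = IDH-wildtype), and the toy data are assumptions made for illustration; they are not taken from the study, and the abstract does not state which software the authors used.

```python
def auc_from_scores(labels, scores):
    """Rank-based AUC: P(score of a positive > score of a negative), ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie contributes half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not study data): 1 = IDH-mutant, 0 = IDH-wildtype.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
toy_auc = auc_from_scores(labels, scores)
print(f"toy AUC = {toy_auc:.3f}")
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a sketch; a sort-based implementation would be preferable for large validation sets.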

RESULTS

A total of 1427 patients (mean age ± standard deviation, 54 ± 15 years; 779 men, 54.6%) were included: 720 in the training set, 180 in the internal validation set, and 527 in the external validation set. In the external validation set, BERT GCN showed the highest performance in predicting IDH mutation status (AUC 0.85, 95% CI 0.81-0.89), which was higher than that of LSTM (AUC 0.77, 95% CI 0.72-0.81; p = .003) and BioBERT (AUC 0.81, 95% CI 0.76-0.85; p = .03). The AUC of BERT GCN was also higher than that of a neuroradiologist (AUC 0.80, 95% CI 0.76-0.84; p = .005) and a neurosurgeon (AUC 0.79, 95% CI 0.76-0.84; p = .04).
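The cohort figures reported above can be checked with simple arithmetic. The short sketch below uses only the counts given in the abstract; the variable names are illustrative, and the derived training fraction is a calculation from those counts, not a split ratio stated by the authors.

```python
# Counts as reported in the abstract.
train, internal, external = 720, 180, 527
total_reported = 1427
men = 779

# The three sets should add up to the reported total.
total = train + internal + external

# Percentage of men, rounded to one decimal as in the abstract.
pct_men = round(100 * men / total, 1)

# Share of the internal cohort (training + internal validation) used for training.
# Derived here for illustration; the abstract does not state a split ratio.
train_fraction = train / (train + internal)

print(total, pct_men, train_fraction)
```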

CONCLUSION

BERT GCN was externally validated to predict IDH mutation status in patients with diffuse glioma using routine MR radiology reports, with performance superior or at least comparable to that of human readers.

CLINICAL RELEVANCE STATEMENT

Natural language processing may be used to extract relevant information from routine radiology reports to predict cancer genotype and provide prognostic information that may aid in guiding treatment strategy and enabling personalized medicine.

KEY POINTS

  • A transformer-based natural language processing (NLP) model predicted isocitrate dehydrogenase mutation status in diffuse glioma with an AUC of 0.85 in the external validation set.

  • The best NLP models were superior or at least comparable to human readers in both internal and external validation sets.

  • Transformer-based models showed higher performance than conventional NLP models such as long short-term memory.
