Automated extraction of information of lung cancer staging from unstructured reports of PET-CT interpretation: natural language processing with deep-learning.

Author information

Department of Pulmonary and Critical Care Medicine, Asan Medical Center, University of Ulsan College of Medicine, 88, Olympic-ro 43-gil, Songpa-gu, Seoul, 05505, South Korea.

Department of Information Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, South Korea.

Publication information

BMC Med Inform Decis Mak. 2022 Sep 1;22(1):229. doi: 10.1186/s12911-022-01975-7.

Abstract

BACKGROUND

Extracting metastatic information from previous radiologic text reports is important; however, the laborious annotation required has limited the usability of these texts. We developed a deep-learning model that extracts primary lung cancer sites, metastatic lymph nodes, and distant metastasis information from PET-CT reports to determine lung cancer stage.

METHODS

PET-CT reports written entirely in English were collected from two cohorts of patients diagnosed with lung cancer at a tertiary hospital between January 2004 and March 2020. One cohort of 20,466 PET-CT reports was used for the training and validation sets, and the other cohort of 4,190 PET-CT reports was used as the additional-test set. A pre-processing model (Lung Cancer Spell Checker) was applied to correct typographical errors, and pseudo-labelling was used to train the model. The deep-learning model was built on a convolutional-recurrent neural network. Performance was measured by accuracy, precision, sensitivity, micro-AUROC, and AUPRC.
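The abstract does not describe how the Lung Cancer Spell Checker works internally. As a rough illustration of the general idea only, the sketch below corrects tokens against a small domain vocabulary using fuzzy string matching from the Python standard library. The vocabulary `DOMAIN_VOCAB`, the function names, and the `cutoff` value are all assumptions made for this example, not details taken from the study.

```python
import difflib
import re

# Hypothetical vocabulary for illustration; the study's actual word list is not given.
DOMAIN_VOCAB = [
    "hypermetabolic", "lymph", "node", "nodes", "metastasis", "mediastinal",
    "lesion", "lobe", "upper", "lower", "right", "left", "in", "the", "with",
]


def correct_token(token, vocab=DOMAIN_VOCAB, cutoff=0.8):
    """Return the closest vocabulary word for a likely typo, else the token unchanged."""
    lower = token.lower()
    # Leave punctuation, numbers, very short words, and known words untouched.
    if not lower.isalpha() or len(lower) < 3 or lower in vocab:
        return token
    matches = difflib.get_close_matches(lower, vocab, n=1, cutoff=cutoff)
    return matches[0] if matches else token


def correct_report(text, vocab=DOMAIN_VOCAB, cutoff=0.8):
    """Correct each alphabetic token of a report, preserving spacing and punctuation."""
    # Alternate runs of letters and non-letters so the text can be rejoined exactly.
    pieces = re.findall(r"[A-Za-z]+|[^A-Za-z]+", text)
    return "".join(correct_token(p, vocab, cutoff) for p in pieces)


if __name__ == "__main__":
    print(correct_report("Hypermetabolic lymph nod in the right uper lobe"))
```

A high cutoff keeps the checker conservative: tokens without a close vocabulary match, such as abbreviations, pass through unchanged rather than being forced onto an unrelated word.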

RESULTS

For the extraction of primary lung cancer location, the model showed micro-AUROCs of 0.913 and 0.946 in the validation set and the additional-test set, respectively. For metastatic lymph nodes, the model showed a sensitivity of 0.827 and a specificity of 0.960. In predicting distant metastasis, the model showed micro-AUROCs of 0.944 and 0.950 in the validation set and the additional-test set, respectively.
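For readers unfamiliar with the reported metrics, the following minimal sketch shows how a micro-averaged AUROC and a sensitivity/specificity pair can be computed. It assumes a single-label multi-class setting (one true class per report), which the abstract does not confirm, and the function names and toy data are hypothetical. Micro-averaging here means that every (report, class) pair is pooled into one binary problem before the AUROC is computed.

```python
def auroc(labels, scores):
    """AUROC as the probability that a positive outscores a negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def micro_auroc(y_true, y_score, n_classes):
    """Pool one-vs-rest decisions over all classes, then compute a single AUROC."""
    flat_labels, flat_scores = [], []
    for true_class, row in zip(y_true, y_score):
        for c in range(n_classes):
            flat_labels.append(1 if c == true_class else 0)
            flat_scores.append(row[c])
    return auroc(flat_labels, flat_scores)


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP), for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Toy data only; these are not values from the study.
    toy_true = [0, 1, 2]
    toy_score = [[0.7, 0.2, 0.1], [0.3, 0.5, 0.2], [0.2, 0.2, 0.6]]
    print(micro_auroc(toy_true, toy_score, n_classes=3))
    print(sensitivity_specificity([1, 1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 1]))
```

The pairwise loop is quadratic in the number of pooled examples, which is fine for a small illustration; a rank-based implementation would be preferable on report sets of the size described above.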

CONCLUSION

Our deep-learning method could be used for extracting lung cancer stage information from PET-CT reports and may facilitate lung cancer studies by alleviating laborious annotation by clinicians.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d853/9438247/b9250b3e6e80/12911_2022_1975_Fig1_HTML.jpg
