Detecting Ground Glass Opacity Features in Patients With Lung Cancer: Automated Extraction and Longitudinal Analysis via Deep Learning-Based Natural Language Processing.

Author information

Lee Kyeryoung, Liu Zongzhi, Chandran Urmila, Kalsekar Iftekhar, Laxmanan Balaji, Higashi Mitchell K, Jun Tomi, Ma Meng, Li Minghao, Mai Yun, Gilman Christopher, Wang Tongyu, Ai Lei, Aggarwal Parag, Pan Qi, Oh William, Stolovitzky Gustavo, Schadt Eric, Wang Xiaoyan

Affiliations

Sema4, Stamford, CT, United States.

Lung Cancer Initiative, Johnson & Johnson, New Brunswick, NJ, United States.

Publication information

JMIR AI. 2023 Jun 1;2:e44537. doi: 10.2196/44537.

DOI: 10.2196/44537

PMID: 38875565

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11041451/

Abstract

BACKGROUND

Ground-glass opacities (GGOs) appearing in computed tomography (CT) scans may indicate potential lung malignancy. Proper management of GGOs based on their features can prevent the development of lung cancer. Electronic health records are rich sources of information on GGO nodules and their granular features, but most of the valuable information is embedded in unstructured clinical notes.

OBJECTIVE

We aimed to develop, test, and validate a deep learning-based natural language processing (NLP) tool that automatically extracts GGO features to inform the longitudinal trajectory of GGO status from large-scale radiology notes.

METHODS

We developed a deep learning NLP pipeline, based on a bidirectional long short-term memory network with a conditional random field layer, to retrospectively extract GGOs and their granular features from the radiology notes of 13,216 patients with lung cancer. We evaluated the pipeline with quality assessments and characterized the cohort by analyzing the distribution of nodule features longitudinally to assess changes in size and solidity over time.
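A sequence tagger of this kind typically emits one label per token, which is then collapsed into entity spans. The following is a minimal sketch of that post-processing step only, assuming a BIO tagging scheme; the label names (`GGO`, `SIZE`, `STATUS`, `LOCATION`) and the example sentence are hypothetical and are not taken from the authors' ontology or pipeline.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collapse per-token BIO tags into (label, text) entity spans."""
    spans: List[Tuple[str, str]] = []
    current_label = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Emit the span collected so far, if any.
        if current_label is not None:
            spans.append((current_label, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always closes the previous span and opens a new one.
            flush()
            current_label, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # An I- tag with a matching label extends the open span.
            current_tokens.append(token)
        else:
            # "O" or a stray I- tag closes the open span; the token is dropped.
            flush()
            current_label, current_tokens = None, []
    flush()
    return spans


# Hypothetical tagger output for one sentence of a radiology note.
tokens = ["Stable", "8", "mm", "ground-glass", "opacity",
          "in", "the", "right", "upper", "lobe", "."]
tags = ["B-STATUS", "B-SIZE", "I-SIZE", "B-GGO", "I-GGO",
        "O", "O", "B-LOCATION", "I-LOCATION", "I-LOCATION", "O"]

spans = bio_to_spans(tokens, tags)
for label, text in spans:
    print(f"{label}: {text}")
```

Structuring the output as labeled spans is what allows features such as size or status to be stored per note and compared across notes later.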

RESULTS

Our NLP pipeline, built on the GGO ontology we developed, achieved precision of 95% to 100%, recall of 89% to 100%, and F-scores of 92% to 100% across the different GGO features. We deployed this GGO NLP model to extract and structure comprehensive characteristics of GGOs from 29,496 radiology notes of 4521 patients with lung cancer. Longitudinal analysis comparing each patient's last note with their first note revealed that nodule size increased in 16.8% (240/1424) of patients, decreased in 14.6% (208/1424), and remained unchanged in 68.5% (976/1424). Among 1127 patients who had longitudinal radiology notes of GGO status, 815 (72.3%) were reported to have stable status, and 259 (23%) had increased or progressed status in the subsequent notes.
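The first-versus-last comparison described above can be illustrated with a short sketch. It assumes each patient's extracted sizes are available as (ISO date, size in mm) pairs; the patient identifiers, the toy values, and the `tolerance_mm` parameter are hypothetical and do not reflect how the authors defined "unchanged".

```python
from collections import Counter
from typing import Dict, List, Tuple

Note = Tuple[str, float]  # (ISO 8601 date, nodule size in mm)


def classify_size_change(notes: List[Note], tolerance_mm: float = 0.0) -> str:
    """Compare the size in the last note with the size in the first note."""
    if len(notes) < 2:
        raise ValueError("need at least two dated notes for a longitudinal comparison")
    # ISO dates sort chronologically as plain strings.
    ordered = sorted(notes)
    delta = ordered[-1][1] - ordered[0][1]
    if delta > tolerance_mm:
        return "increased"
    if delta < -tolerance_mm:
        return "decreased"
    return "unchanged"


# Toy data, not from the study.
patients: Dict[str, List[Note]] = {
    "patient_a": [("2019-03-01", 6.0), ("2020-03-01", 9.0)],
    "patient_b": [("2021-01-10", 12.0), ("2020-01-10", 14.0)],
    "patient_c": [("2018-05-05", 8.0), ("2019-05-05", 10.0), ("2020-05-05", 8.0)],
}

summary = Counter(classify_size_change(notes) for notes in patients.values())
for category in ("increased", "decreased", "unchanged"):
    print(f"{category}: {summary[category]}/{len(patients)}")
```

Note that a first-versus-last comparison ignores intermediate measurements, so a nodule that grows and then shrinks back (as for `patient_c`) is counted as unchanged.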

CONCLUSIONS

Our deep learning-based NLP pipeline can automatically extract granular GGO features at scale from electronic health records when this information is documented in radiology notes and help inform the natural history of GGO. This will open the way for a new paradigm in lung cancer prevention and early detection.
