Artificial intelligence for early detection of lung cancer in GPs' clinical notes: a retrospective observational cohort study.

Authors

Schut Martijn C, Luik Torec T, Vagliano Iacopo, Rios Miguel, Helsper Charles W, van Asselt Kristel M, de Wit Niek, Abu-Hanna Ameen, van Weert Henk Cpm

Affiliations

Department of Laboratory Medicine, Amsterdam University Medical Center (UMC) Vrije Universiteit Amsterdam, Amsterdam; Amsterdam Public Health, Amsterdam UMC, Amsterdam, the Netherlands.

Department of Medical Informatics, Amsterdam UMC Academic Medical Center (AMC), Amsterdam; Amsterdam Public Health, Amsterdam UMC, Amsterdam; Department of Medical Biology, Amsterdam UMC AMC, Amsterdam, the Netherlands.

Publication information

Br J Gen Pract. 2025 May 2;75(754):e316-e322. doi: 10.3399/BJGP.2023.0489. Print 2025 May.

DOI:10.3399/BJGP.2023.0489

PMID:40044183

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12040367/

Abstract

BACKGROUND

The journey of >80% of patients diagnosed with lung cancer starts in general practice. About 75% of patients are diagnosed at an advanced stage (3 or 4), which at present leads to >80% mortality within 1 year. The long-term data in GP records might contain hidden information that could be used for earlier case finding of patients with cancer.

AIM

To develop new prediction tools that improve the risk assessment for lung cancer.

DESIGN AND SETTING

Text analysis of electronic patient data using natural language processing and machine learning in the general practice files of four networks in the Netherlands.

METHOD

Files of 525 526 patients were analysed, of whom 2386 were diagnosed with lung cancer. Diagnoses were validated by using the Dutch cancer registry, and both structured and free-text data were used to predict the diagnosis of lung cancer 5 months before diagnosis (4 months before referral).

RESULTS

The algorithm could facilitate earlier detection of lung cancer using routine general practice data. Discrimination, calibration, sensitivity, and specificity were established under various cut-off points of the prediction 5 months before diagnosis. Internal validation of the best model demonstrated an area under the curve of 0.88 (95% confidence interval [CI] = 0.86 to 0.89), which fell to 0.79 (95% CI = 0.78 to 0.80) during external validation. The desired sensitivity determines the number of patients to be referred to detect one patient with lung cancer.
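
The trade-off between the chosen sensitivity and the number of referrals per detected case can be illustrated with a short calculation. The sketch below is not the study's method: it simply applies the standard relation "number needed to refer = 1 / positive predictive value", uses the cohort counts from the METHOD section (2386 cases among 525 526 patients) as a crude prevalence, and assumes purely hypothetical sensitivity/specificity pairs, since the abstract does not report the values at each cut-off. The function name `number_needed_to_refer` is likewise introduced here for illustration only.

```python
def number_needed_to_refer(sensitivity, specificity, prevalence):
    """Patients flagged per true case, i.e. 1 / positive predictive value."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    return 1.0 / ppv


# Crude prevalence from the abstract: 2386 lung cancer cases in 525 526 files.
# Assumption: using it as the pre-test probability ignores the time window.
PREVALENCE = 2386 / 525526

# Hypothetical operating points (sensitivity, specificity); NOT study results.
OPERATING_POINTS = [(0.50, 0.95), (0.70, 0.85), (0.90, 0.60)]

for sens, spec in OPERATING_POINTS:
    nnr = number_needed_to_refer(sens, spec, PREVALENCE)
    print(f"sensitivity={sens:.2f}, specificity={spec:.2f}: "
          f"about {nnr:.0f} referrals per detected case")
```

Under these assumed values, raising sensitivity (by lowering the cut-off) lowers specificity, and the number of referrals needed per detected case rises accordingly; the actual figures depend on the model's real operating characteristics.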

CONCLUSION

Artificial intelligence-based support enables earlier detection of lung cancer in general practice using readily available text in the patient files of GPs, but needs additional prospective clinical evaluation.
