Predicting the future risk of lung cancer: development, and internal and external validation of the CanPredict (lung) model in 19·67 million people and evaluation of model performance against seven other risk prediction models.

Affiliations

Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, UK.

Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, UK; School of Medicine, University of Nottingham, Nottingham, UK.

Publication information

Lancet Respir Med. 2023 Aug;11(8):685-697. doi: 10.1016/S2213-2600(23)00050-4. Epub 2023 Apr 5.

BACKGROUND

Lung cancer is the second most common cancer in incidence and the leading cause of cancer deaths worldwide. Lung cancer screening with low-dose CT can reduce mortality. The UK National Screening Committee recommended targeted lung cancer screening on Sept 29, 2022, and asked for more modelling work to be done to help refine the recommendation. This study aims to develop and validate a risk prediction model, the CanPredict (lung) model, for lung cancer screening in the UK and to compare its performance against seven other risk prediction models.

METHODS

For this retrospective, population-based, cohort study, we used linked electronic health records from two English primary care databases: QResearch (Jan 1, 2005, to March 31, 2020) and Clinical Practice Research Datalink (CPRD) Gold (Jan 1, 2004, to Jan 1, 2015). The primary study outcome was an incident diagnosis of lung cancer. We used a Cox proportional-hazards model in the derivation cohort (12·99 million individuals aged 25-84 years from the QResearch database) to develop the CanPredict (lung) model in men and women. We used discrimination measures (Harrell's C statistic, D statistic, and the explained variation in time to diagnosis of lung cancer [R²]) and calibration plots to evaluate model performance by sex and ethnicity, using data from QResearch (4·14 million people for internal validation) and CPRD (2·54 million for external validation). Seven models for predicting lung cancer risk (Liverpool Lung Project [LLP] v2, LLP v3, Lung Cancer Risk Assessment Tool [LCRAT], Prostate, Lung, Colorectal, and Ovarian [PLCO] M2012, PLCO M2014, Pittsburgh, and Bach) were selected for comparison with the CanPredict (lung) model using two approaches: (1) in ever-smokers aged 55-74 years (the population recommended for lung cancer screening in the UK), and (2) in the populations for each model determined by that model's eligibility criteria.
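To make the discrimination measure concrete, the following is a minimal sketch of how Harrell's C statistic can be computed for right-censored data. It is an illustration on made-up toy data, not the study's analysis code; the function name `harrell_c` and the simplified handling of tied follow-up times (such pairs are skipped) are assumptions made for this example.

```python
from itertools import combinations


def harrell_c(times, events, risk_scores):
    """Harrell's C: share of usable pairs in which the subject who had the
    event earlier was also given the higher predicted risk.

    A pair is usable only when the subject with the shorter follow-up
    actually had the event (otherwise the ordering is unknown).
    Pairs with tied follow-up times are skipped in this simplified sketch.
    """
    concordant = 0.0
    usable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        if times[a] == times[b] or not events[a]:
            continue
        usable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Toy data (hypothetical): follow-up in years, 1 = lung cancer diagnosed,
# 0 = censored, and a predicted risk score per person.
toy_times = [2.0, 4.0, 5.0, 7.0, 9.0]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.4, 0.6, 0.3, 0.1]

c_stat = harrell_c(toy_times, toy_events, toy_risk)
print(f"Harrell's C on toy data: {c_stat:.3f}")
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the C statistics in this abstract are reported.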

FINDINGS

There were 73 380 incident lung cancer cases in the QResearch derivation cohort, 22 838 cases in the QResearch internal validation cohort, and 16 145 cases in the CPRD external validation cohort during follow-up. The predictors in the final model included sociodemographic characteristics (age, sex, ethnicity, Townsend score), lifestyle factors (BMI, smoking and alcohol status), comorbidities, family history of lung cancer, and personal history of other cancers. Some predictors were different between the models for women and men, but model performance was similar between sexes. The CanPredict (lung) model showed excellent discrimination and calibration in both internal and external validation of the full model, by sex and ethnicity. The model explained 65% of the variation in time to diagnosis of lung cancer (R²) in both sexes in the QResearch validation cohort and 59% in both sexes in the CPRD validation cohort. Harrell's C statistics were 0·90 in the QResearch (validation) cohort and 0·87 in the CPRD cohort, and the D statistics were 2·8 in the QResearch (validation) cohort and 2·4 in the CPRD cohort. Compared with seven other lung cancer prediction models, the CanPredict (lung) model had the best performance in discrimination, calibration, and net benefit across three prediction horizons (5, 6, and 10 years) in the two approaches. The CanPredict (lung) model also had higher sensitivity than the current UK recommended models (LLP v2 and PLCO M2012), as it identified more lung cancer cases than those models when the same number of individuals at high risk were screened.
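The reported D statistics and explained-variation figures can be cross-checked against each other. The sketch below assumes the explained-variation measure is Royston and Sauerbrei's R²_D, which is derived from D as R²_D = (D²/κ²) / (π²/6 + D²/κ²) with κ² = 8/π; the abstract does not state the formula, so that link is an assumption of this example, and `r2_from_d` is a name chosen here.

```python
import math

# Constants from Royston and Sauerbrei's definition (assumed to apply here).
KAPPA_SQ = 8.0 / math.pi        # scaling constant kappa^2
SIGMA_SQ = math.pi ** 2 / 6.0   # error variance term for a Cox model


def r2_from_d(d):
    """Convert a D statistic into the explained-variation measure R^2_D."""
    scaled = d * d / KAPPA_SQ
    return scaled / (SIGMA_SQ + scaled)


# D statistics as reported in the abstract (rounded to one decimal place).
for cohort, d in [("QResearch validation", 2.8), ("CPRD", 2.4)]:
    print(f"{cohort}: D = {d} -> R2_D = {r2_from_d(d):.3f}")
```

Under that assumption, D = 2·8 gives roughly 0·65 and D = 2·4 gives roughly 0·58, in line with the reported 65% and 59%; the small gap for CPRD is consistent with D having been rounded to one decimal place.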

INTERPRETATION

The CanPredict (lung) model was developed, and internally and externally validated, using data from 19·67 million people from two English primary care databases. Our model has potential utility for risk stratification of the UK primary care population and selection of individuals at high risk of lung cancer for targeted screening. If our model is recommended to be implemented in primary care, each individual's risk can be calculated using information in the primary care electronic health records, and people at high risk can be identified for the lung cancer screening programme.
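As a rough illustration of how an individual's risk could be calculated from record data with a Cox model, the sketch below applies the standard formula risk(t) = 1 − S₀(t)^exp(Σ β(x − x̄)) and then flags people above a cut-off. Every number and name in it (the baseline survival, the coefficients, the covariate means, the three predictors, and the 3% threshold) is hypothetical and chosen only for the example; none of it comes from the published CanPredict (lung) model.

```python
import math

# HYPOTHETICAL values for illustration only; these are NOT the published
# CanPredict (lung) coefficients, baseline survival, or predictor set.
BASELINE_SURVIVAL_10Y = 0.995
COEFFICIENTS = {"age_per_10y": 0.80, "current_smoker": 1.50, "family_history": 0.40}
MEANS = {"age_per_10y": 5.0, "current_smoker": 0.2, "family_history": 0.05}


def absolute_risk(patient):
    """10-year absolute risk from a Cox model: 1 - S0(t) ** exp(lp)."""
    # Linear predictor, centred on the (hypothetical) cohort means.
    lp = sum(COEFFICIENTS[k] * (patient[k] - MEANS[k]) for k in COEFFICIENTS)
    return 1.0 - BASELINE_SURVIVAL_10Y ** math.exp(lp)


def select_high_risk(patients, threshold):
    """Return the IDs of patients whose predicted risk meets the threshold."""
    return [pid for pid, p in patients.items() if absolute_risk(p) >= threshold]


# Two made-up records, as might be assembled from primary care data.
patients = {
    "A": {"age_per_10y": 6.5, "current_smoker": 1, "family_history": 1},
    "B": {"age_per_10y": 4.0, "current_smoker": 0, "family_history": 0},
}

for pid, record in patients.items():
    print(f"patient {pid}: predicted 10-year risk = {absolute_risk(record):.4f}")
print("flagged for screening:", select_high_risk(patients, threshold=0.03))
```

In practice the threshold would be set by the screening programme, and the comparison in the findings (more cases identified for the same number of people screened) corresponds to ranking people by predicted risk and comparing models at a fixed number selected.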

FUNDING

Innovate UK (UK Research and Innovation).

TRANSLATION

For the Chinese translation of the abstract see Supplementary Materials section.
