Machine Learning Prediction of Progression in Forced Expiratory Volume in 1 Second in the COPDGene® Study.

Authors

Boueiz Adel, Xu Zhonghui, Chang Yale, Masoomi Aria, Gregory Andrew, Lutz Sharon M, Qiao Dandi, Crapo James D, Dy Jennifer G, Silverman Edwin K, Castaldi Peter J

Affiliations

Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States.

Pulmonary and Critical Care Division, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States.

Publication information

Chronic Obstr Pulm Dis. 2022 Jul 29;9(3):349-365. doi: 10.15326/jcopdf.2021.0275.

DOI:10.15326/jcopdf.2021.0275

PMID:35649102

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9448009/

Abstract

BACKGROUND

The heterogeneous nature of chronic obstructive pulmonary disease (COPD) complicates the identification of the predictors of disease progression. We aimed to improve the prediction of disease progression in COPD by using machine learning and incorporating a rich dataset of phenotypic features.

METHODS

We included 4496 smokers with available data from their enrollment and 5-year follow-up visits in the COPD Genetic Epidemiology (COPDGene) study. We constructed linear regression (LR) and supervised random forest models to predict 5-year progression in forced expiratory volume in 1 second (FEV1) from 46 baseline features. Using cross-validation, we randomly partitioned participants into training and testing samples. We also validated the results using the COPDGene 10-year follow-up visit.
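The modeling setup described above can be sketched roughly as follows. This is only an illustration on synthetic data, not the study's code or data: the sample size, the simulated outcome, the 5-fold split, and the forest size are all assumptions chosen for the example, and scikit-learn is assumed as the modeling library.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import KFold, cross_val_predict

# Synthetic stand-in data (assumption): NOT COPDGene data.
rng = np.random.default_rng(0)
n_participants, n_features = 500, 46
X = rng.normal(size=(n_participants, n_features))
# Hypothetical outcome: 5-year change in FEV1 (mL), a weak signal plus noise.
y = (-200 + 40 * X[:, 0] - 30 * X[:, 1] * (X[:, 2] > 0)
     + rng.normal(scale=150, size=n_participants))

# Random partition into training and testing folds (5 folds is an assumption).
cv = KFold(n_splits=5, shuffle=True, random_state=0)
models = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

cv_r2 = {}
for name, model in models.items():
    # Each participant is predicted by a model fitted without that participant.
    y_pred = cross_val_predict(model, X, y, cv=cv)
    cv_r2[name] = r2_score(y, y_pred)
    print(f"{name}: out-of-fold R^2 = {cv_r2[name]:.2f}")
```

Scoring only out-of-fold predictions keeps the R-squared an estimate of performance on participants the model has not seen, which is the purpose of the training/testing partition.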

RESULTS

Predicting the change in FEV1 over time is more challenging than simply predicting the future absolute FEV1 level. For random forest, the R-squared was 0.15 and the area under the receiver operating characteristic (ROC) curve for predicting participants in the top quartile of observed progression was 0.71 in the testing sample; the corresponding values in the validation sample were 0.10 and 0.70. Random forest provided slightly better performance than LR. Accuracy was best for Global initiative for chronic Obstructive Lung Disease (GOLD) grades 1-2 participants, and accurate prediction was harder to achieve in advanced stages of the disease. The relative importance of the predictive variables differed, both overall and across predictions stratified by GOLD grade.
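As a rough illustration of the second metric reported above, the sketch below labels participants in the top quartile of observed decline and computes the ROC area for a continuous prediction. The data are simulated, and the function names `top_quartile_labels` and `roc_auc` are hypothetical helpers introduced for this example, not part of the study.

```python
import random
import statistics

def top_quartile_labels(observed_decline):
    """Return 1 for values above the 75th percentile of observed decline, else 0."""
    cutoff = statistics.quantiles(observed_decline, n=4)[2]  # third quartile
    return [1 if d > cutoff else 0 for d in observed_decline]

def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Simulated values (assumption): larger numbers mean faster FEV1 decline, in mL.
random.seed(0)
observed = [random.gauss(200, 150) for _ in range(400)]
# Hypothetical model output that is only weakly related to the observed decline.
predicted = [0.3 * d + random.gauss(0, 120) for d in observed]

labels = top_quartile_labels(observed)
auc = roc_auc(labels, predicted)
print(f"Top-quartile AUC: {auc:.2f}")
```

A model can rank rapid decliners reasonably well, giving a moderate AUC, even when it explains little of the variance in the exact amount of change, which is consistent with a low R-squared appearing alongside an AUC near 0.7.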

CONCLUSION

Random forest, along with deep phenotyping, predicts FEV1 progression with reasonable accuracy. There is significant room for improvement in future models. This prediction model facilitates the identification of smokers at increased risk for rapid disease progression. Such findings may be useful in the selection of patient populations for targeted clinical trials.
