Department of Health Science and Technology, Aalborg University, Aalborg, Denmark.
Steno Diabetes Center North Denmark, Aalborg, Denmark.
Clin Respir J. 2023 Aug;17(8):819-828. doi: 10.1111/crj.13662. Epub 2023 Jul 13.
Spirometry is associated with several diagnostic difficulties, and as a result, chronic obstructive pulmonary disease (COPD) is sometimes misdiagnosed. This study aims to investigate how random forest (RF) can be used to improve the existing clinical reference values for forced vital capacity (FVC) and forced expiratory volume in 1 second (FEV1) in a large and representative cohort of the general US population without known lung disease.
FVC, FEV1, body measures, and demographic data from 23 433 people were extracted from NHANES. RF was used to develop different prediction models. The accuracy of RF was compared with the existing Danish clinical references, an improved multiple linear regression (MLR) model, and a model from the literature.
For RF, the correlation between actual and predicted values (95% confidence interval) was 0.85 (0.85; 0.86) for FVC and 0.92 (0.92; 0.93) for FEV1 (both p < 0.001). For the existing clinical references, it was 0.66 (0.64; 0.68) for FVC and 0.69 (0.67; 0.70) for FEV1 (both p < 0.001). Slope and intercept for the RF models were 1.06 and -238.04 mL for FVC and 0.86 and 455.36 mL for FEV1; for the MLR models, they were 0.99 and 38.56 mL for FVC and 1.01 and -56.57 mL for FEV1.
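The agreement metrics reported above (correlation with a 95% confidence interval, plus slope and intercept) can be illustrated with a minimal sketch. The abstract does not state how these were computed, so the choices here are assumptions: a Pearson correlation, a Fisher z-transform interval, and an ordinary least-squares line regressing actual on predicted values. The function names and the five data points are hypothetical and are not study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for r via the Fisher z-transform (assumed method).

    Requires n > 3 and |r| < 1.
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


def slope_intercept(predicted, actual):
    """Least-squares line actual = slope * predicted + intercept.

    A perfectly calibrated model would give slope 1 and intercept 0.
    """
    n = len(predicted)
    mp, ma = sum(predicted) / n, sum(actual) / n
    sxy = sum((p - mp) * (a - ma) for p, a in zip(predicted, actual))
    sxx = sum((p - mp) ** 2 for p in predicted)
    slope = sxy / sxx
    return slope, ma - slope * mp


# Made-up illustrative volumes in mL (NOT data from the study).
predicted = [3000.0, 3500.0, 4000.0, 4500.0, 5000.0]
actual = [3100.0, 3400.0, 4100.0, 4400.0, 5100.0]

r = pearson_r(predicted, actual)
ci_low, ci_high = fisher_ci(r, len(predicted))
slope, intercept = slope_intercept(predicted, actual)
print(f"r = {r:.3f} ({ci_low:.3f}; {ci_high:.3f})")
print(f"slope = {slope:.2f}, intercept = {intercept:.2f} mL")
```

Under this reading, a slope near 1 with an intercept near 0 mL indicates predictions that track measured volumes without systematic offset, which is why the intercepts are reported alongside the correlations.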
The results suggest that machine learning models such as RF have the potential to improve the prediction of estimated lung function for individual patients. These predictions are used as reference values and are an important part of assessing spirometry measurements in clinical practice. Further work is necessary to reduce the size of the intercepts obtained in these results.