Xia Tiansheng, Han Kaiyu
Department of Respiratory and Critical Care Medicine, The Second Affiliated Hospital of Harbin Medical University, 246 Xuefu Road, Nangang District, Harbin, 150001, China.
BMC Pulm Med. 2025 May 22;25(1):252. doi: 10.1186/s12890-025-03724-8.
Chronic bronchitis (CB), a core precursor of chronic obstructive pulmonary disease (COPD), is an important target for preventing and controlling the global disease burden. Although an association between heavy metal exposure and respiratory damage has been preliminarily demonstrated, traditional linear models struggle to capture nonlinear interactions and dose-response heterogeneity. This study aimed to construct the first risk prediction model linking heavy metal exposure to chronic bronchitis by integrating exposomics data through machine learning (ML).
Using nationally representative samples from the 2005-2015 National Health and Nutrition Examination Survey (NHANES), weighted logistic regression was used to assess the association of 14 blood and urine heavy metals with CB. The Boruta algorithm was then applied to select feature variables, and 10 ML models were constructed. The best model was selected using four evaluation metrics: accuracy, specificity, sensitivity, and area under the ROC curve (AUC). It was then interpreted visually using SHapley Additive exPlanations (SHAP).
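The four selection metrics named above can be computed directly from true labels and predicted probabilities. The following is a minimal sketch in plain Python, not the authors' pipeline: the function names, the 0.5 decision threshold, and the toy data are assumptions made for illustration only.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win (the Mann-Whitney formulation of AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def classification_metrics(y_true, scores, threshold=0.5):
    """Accuracy, sensitivity, specificity and AUC for a binary classifier.

    The threshold of 0.5 is an assumed default, not a value from the study.
    """
    y_pred = [1 if s >= threshold else 0 for s in scores]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "auc": roc_auc(y_true, scores),
    }


# Toy data (hypothetical): 1 = chronic bronchitis, 0 = no chronic bronchitis.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
print(classification_metrics(toy_labels, toy_scores))
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason to report the four together when comparing candidate models.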
The multivariable logistic regression model showed that urinary cadmium (OR = 1.53, 95% CI = 1.17-1.98) and blood cadmium (OR = 1.36, 95% CI = 1.13-1.65) were independent risk factors for CB. The CatBoost model had the best predictive performance (AUC = 0.805), with smoking as the most important predictor, followed by blood cadmium concentration and gender.
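An odds ratio and its confidence interval are the exponentiated logistic regression coefficient and its interval on the log-odds scale. The sketch below shows that relationship and back-calculates an approximate coefficient and standard error from the reported urinary cadmium estimate. It assumes a normal critical value of 1.96; survey-weighted analyses may use a different critical value, so the recovered numbers are approximations, and the function names are hypothetical.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) from a log-odds coefficient and its SE."""
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))


def coef_from_or_ci(odds_ratio, lower, upper, z=1.96):
    """Approximate the coefficient and SE implied by a reported OR and CI.

    Assumes the interval is symmetric on the log scale with critical value z.
    """
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Reported urinary cadmium estimate: OR = 1.53, 95% CI = 1.17-1.98.
beta, se = coef_from_or_ci(1.53, 1.17, 1.98)
print(f"beta = {beta:.3f}, se = {se:.3f}")
print("round trip:", [round(v, 2) for v in odds_ratio_ci(beta, se)])
```

Because the published values are rounded, the round trip reproduces the reported interval only approximately.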
This study developed the first risk prediction model for heavy metal exposure and chronic bronchitis. CatBoost performed best among the candidate models, and the resulting model provides a reference for screening high-risk groups.