Jiang Liyan, Wang Hongling, Xiao Yang, Xu Linlin, Chen Huoying
Department of Laboratory Medicine, The Second Affiliated Hospital of Guilin Medical University, Guilin, China.
School of Clinical Medicine, Guilin Medical University, Guilin, China.
Ren Fail. 2025 Dec;47(1):2520906. doi: 10.1080/0886022X.2025.2520906. Epub 2025 Jun 23.
Chronic kidney disease (CKD) affects approximately 697.5 million people worldwide. Volatile organic compounds (VOCs) are emerging as potential risk factors, but traditional linear methods may underestimate their complex relationships with CKD. This study examines the association between urinary VOC metabolites and CKD risk by combining epidemiological analysis with interpretable machine learning.
Data from the National Health and Nutrition Examination Survey (NHANES; 2011 to March 2020, pre-pandemic) were analyzed for 15 urinary VOC metabolites. Analytical methods included multivariable logistic regression, LASSO regression, and five machine learning models: Logistic Regression (LR), Random Forest (RF), Extreme Gradient Boosting (XGBoost), K-Nearest Neighbors (KNN), and Multilayer Perceptron (MLP). SHapley Additive exPlanations (SHAP) analysis was used to enhance model interpretability.
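To illustrate what a SHAP-style attribution measures, the sketch below computes exact Shapley values for a toy model. It is not the study's pipeline: the function `toy_risk`, its coefficients, the input values, and the all-zero baseline are hypothetical and chosen only for illustration. Only the metabolite names are borrowed from the abstract. SHAP libraries typically approximate these values against a background dataset rather than enumerating every feature ordering against a single baseline.

```python
from itertools import permutations
from math import exp, factorial


def toy_risk(x):
    """Hypothetical logistic risk score; the coefficients are made up, not fitted."""
    z = -1.0 + 0.7 * x["DHBMA"] + 0.5 * x["HMPMA"] - 0.3 * x["AAMA"]
    return 1.0 / (1.0 + exp(-z))


def shapley_values(model, instance, baseline):
    """Exact Shapley values: mean marginal contribution over all feature orderings."""
    features = list(instance)
    phi = {f: 0.0 for f in features}
    for order in permutations(features):
        current = dict(baseline)      # start from the reference input
        previous = model(current)
        for f in order:
            current[f] = instance[f]  # switch one feature to its observed value
            updated = model(current)
            phi[f] += updated - previous
            previous = updated
    n_orders = factorial(len(features))
    return {f: total / n_orders for f, total in phi.items()}


# Hypothetical standardized metabolite levels for one participant.
instance = {"DHBMA": 1.5, "HMPMA": 0.8, "AAMA": 1.0}
# Hypothetical reference participant (all features at zero).
baseline = {"DHBMA": 0.0, "HMPMA": 0.0, "AAMA": 0.0}

phi = shapley_values(toy_risk, instance, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.4f}")
```

The attributions sum to the difference between the model output for the instance and for the baseline, which is the additivity property that lets each prediction be decomposed into per-metabolite contributions.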
Significant associations were observed for metabolites including CEMA (N-Acetyl-S-(2-carboxyethyl)-L-cysteine) (OR = 1.66, 95% CI: 1.17-2.37), DHBMA (N-Acetyl-S-(3,4-dihydroxybutyl)-L-cysteine) (OR = 1.95, 95% CI: 1.38-2.76), HMPMA (N-Acetyl-S-(3-hydroxypropyl-1-methyl)-L-cysteine) (OR = 2.18, 95% CI: 1.53-3.10), and PGA (Phenylglyoxylic acid) (OR = 1.66, 95% CI: 1.22-2.27). The XGBoost model demonstrated strong predictive performance, with SHAP analysis highlighting DHBMA as a key predictor. Inverse associations were observed for AAMA (N-Acetyl-S-(2-carbamoylethyl)-L-cysteine) and CYMA (N-Acetyl-S-(2-cyanoethyl)-L-cysteine) in their highest quartiles.
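The reported odds ratios and intervals can be read on the log-odds scale. The sketch below assumes ordinary Wald-type 95% intervals, that is, exp(beta ± 1.96 × SE); the abstract does not state how the intervals were computed, and a survey-weighted analysis may use a different critical value, so the standard errors recovered here are approximate. The helper name `wald_summary` is introduced only for this example.

```python
from math import exp, log

# Odds ratios and 95% CIs exactly as reported in the abstract: (OR, lower, upper).
reported = {
    "CEMA": (1.66, 1.17, 2.37),
    "DHBMA": (1.95, 1.38, 2.76),
    "HMPMA": (2.18, 1.53, 3.10),
    "PGA": (1.66, 1.22, 2.27),
}

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval (assumed)


def wald_summary(odds_ratio, lower, upper):
    """Recover the log-odds coefficient and an approximate SE from an OR and its CI."""
    beta = log(odds_ratio)
    se = (log(upper) - log(lower)) / (2 * Z_95)
    # A Wald interval is symmetric on the log scale, so the geometric mean
    # of its bounds should reproduce the point estimate up to rounding.
    midpoint_or = exp((log(lower) + log(upper)) / 2)
    return beta, se, midpoint_or


summary = {name: wald_summary(*values) for name, values in reported.items()}
for name, (beta, se, midpoint_or) in summary.items():
    print(f"{name}: beta={beta:.3f}, SE={se:.3f}, CI midpoint OR={midpoint_or:.2f}")
```

For all four metabolites the geometric midpoint of the interval matches the reported odds ratio to within rounding, which is consistent with intervals that are symmetric on the log-odds scale.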
This integrated approach identified significant associations between specific urinary VOC metabolites and CKD risk, particularly DHBMA. These findings underscore the role of environmental VOC exposure in CKD pathogenesis and may inform targeted prevention strategies.