Department of Pathology, Affiliated Hospital of Nantong University, Nantong, China.
Department of Thoracic Surgery, Affiliated Hospital of Nantong University, Nantong, China.
Front Public Health. 2024 Aug 1;12:1405533. doi: 10.3389/fpubh.2024.1405533. eCollection 2024.
Limited investigation is available on the correlation between environmental phenol exposure and estimated glomerular filtration rate (eGFR). Our aim was to establish a robust and explainable machine learning (ML) model that associates environmental phenol exposure with eGFR.
The datasets used to model the association between environmental phenols and eGFR were collected from the National Health and Nutrition Examination Survey (NHANES, 2013-2016). Five ML models were included and fine-tuned to regress eGFR on phenol exposure. Regression evaluation metrics were used to identify the limitations of the models. The most effective model was then used for regression, and its features were interpreted with the game-theoretic Shapley additive explanations (SHAP) Python package to characterize the model's regression capacity.
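The regression evaluation metrics mentioned above can be illustrated with a minimal sketch of the two reported in this abstract, mean absolute error and the coefficient of determination. The function names and the sample eGFR values are hypothetical and chosen only for illustration; they are not the study's code or data.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between measured and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all measured values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical eGFR values (mL/min/1.73 m^2), for illustration only.
measured = [90.0, 75.0, 110.0, 60.0]
predicted = [88.0, 78.0, 107.0, 63.0]

mae = mean_absolute_error(measured, predicted)
r2 = r2_score(measured, predicted)
print(f"MAE = {mae:.3f}, R^2 = {r2:.3f}")
```

A lower mean absolute error and a coefficient of determination closer to 1 both indicate a closer fit, which is how candidate regressors can be ranked against each other.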
The top-performing model was a random forest (RF) regressor, with a mean absolute error of 0.621 and a coefficient of determination of 0.998 among 3,371 participants. Linear regression models of six environmental phenols against eGFR showed that urinary concentrations of triclosan (TCS) and bisphenol S (BPS) were positively correlated with eGFR, with coefficients of 0.010 (p = 0.026) and 0.007 (p = 0.004), respectively. SHAP values indicated that urinary BPS (1.38), bisphenol F (BPF) (0.97), 2,5-dichlorophenol (0.87), TCS (0.78), BP3 (0.60), bisphenol A (BPA) (0.59), and 2,4-dichlorophenol (0.47) contributed to the model.
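To make the idea behind SHAP attributions concrete, the following sketch computes exact Shapley values by brute-force enumeration of feature coalitions for a single prediction. It is not the SHAP package and not the study's pipeline: the toy `predict` function, the three-feature input, and the all-zero baseline are assumptions made purely for illustration, and features outside a coalition are simply set to their baseline value.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence


def shapley_values(
    predict: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> List[float]:
    """Exact Shapley attribution of predict(x) - predict(baseline) to each feature."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                without_i = [x[j] if j in members else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition.
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def predict(v: Sequence[float]) -> float:
    """Hypothetical toy model: one additive term and one interaction term."""
    return 100.0 - 2.0 * v[0] - v[1] * v[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(predict, x, baseline)
print(phi)
```

The attributions sum to the difference between the prediction at `x` and the prediction at the baseline, which is the property that lets per-feature values be read as contributions to a model output. Enumeration scales exponentially with the number of features, so it is practical only for small illustrations like this one.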
The RF model was effective in identifying a correlation between phenol exposure and eGFR among United States NHANES 2013-2016 participants. The findings indicate that BPA, BPF, and BPS are inversely associated with eGFR.