
Soil Cadmium Prediction and Health Risk Assessment of an Oasis on the Eastern Edge of the Tarim Basin Based on Feature Optimization and Machine Learning

Authors

Liu Jing-Yu, Li Ruo-Yi, Liang Yong-Chun, Liu Lei, Yin Fang, Tang Su, He Lin-Sen, Zhang Yi

Affiliations

School of Earth Science and Resources, Chang'an University, Xi'an 710054, China.

Center of Urumqi Comprehensive Survey Natural Resources, China Geological Survey, Urumqi 830057, China.

Publication

Huan Jing Ke Xue. 2024 Aug 8;45(8):4802-4811. doi: 10.13227/j.hjkx.202308010.

Abstract

Soil heavy metal pollution poses a serious threat to food security, human health, and soil ecosystems. Using 644 soil samples collected from a typical oasis on the eastern margin of the Tarim Basin, five models were built to predict soil heavy metal content: multiple linear regression (LR), back-propagation neural network (BP), random forest (RF), support vector machine (SVM), and radial basis function (RBF). The best-performing prediction was then used to analyze the spatial distribution of heavy metal contamination and the associated health risks. The results showed that: ① The average Cd content in the study area was 0.14 mg·kg⁻¹, which was 1.17 times the soil background value of Xinjiang, making Cd the primary contributor to soil heavy metal contamination in the area. The carcinogenic risk coefficients of Cd for both adults and children were less than 10, indicating no significant long-term health risk for people in the area. ② Among the five inversion models, the RF model had the highest validation-set R² (0.7637) and the smallest RMSE, MAE, and MBE, so its predictions agreed most closely with the measured soil Cd content. The soil Cd distribution map predicted by the RF model also matched the interpolation map most closely. ③ The RF model outperformed the other four models in predicting Cd-related health risks for both adults and children. By contrast, the validation-set predictions of the LR model varied widely, making its results unreliable. Overall, RF was the best model for predicting soil Cd content and evaluating health risks in the study area, owing to its superior generalization capability and resistance to overfitting.
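The four accuracy measures named in the abstract (R², RMSE, MAE, MBE) can be illustrated with a minimal sketch. This is not the authors' code: the function name `regression_metrics`, the sign convention for MBE (predicted minus measured), and the sample values are all assumptions made for illustration, and the numbers are not data from the study.

```python
import math


def regression_metrics(measured, predicted):
    """Return R2, RMSE, MAE and MBE for paired measured/predicted values.

    Hypothetical helper for illustration; MBE is taken here as
    mean(predicted - measured), which is an assumed sign convention.
    """
    n = len(measured)
    errors = [p - m for m, p in zip(measured, predicted)]
    mean_measured = sum(measured) / n
    ss_res = sum(e * e for e in errors)                        # residual sum of squares
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)   # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "MBE": sum(errors) / n,
    }


# Made-up Cd contents in mg/kg, purely illustrative (not from the paper).
measured = [0.10, 0.12, 0.14, 0.16, 0.18]
predicted = [0.11, 0.12, 0.13, 0.17, 0.18]
metrics = regression_metrics(measured, predicted)
```

Under this convention a higher R² and lower RMSE and MAE indicate closer agreement with the measurements, while MBE shows the direction of any systematic bias, so comparing models on MBE would normally use its absolute value.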

