Machine Learning Models for Predicting Bioavailability of Traditional and Emerging Aromatic Contaminants in Plant Roots.

Author information

Li Siyuan, Shen Yuting, Gao Meng, Song Huatai, Ge Zhanpeng, Zhang Qiuyue, Xu Jiaping, Wang Yu, Sun Hongwen

Affiliations

MOE Key Laboratory of Pollution Processes and Environmental Criteria, College of Environmental Science and Engineering, Nankai University, Tianjin 300350, China.

Publication information

Toxics. 2024 Oct 12;12(10):737. doi: 10.3390/toxics12100737.

Abstract

To predict the behavior of aromatic contaminants (ACs) in complex soil-plant systems, this study developed machine learning (ML) models to estimate the root concentration factor (RCF) of both traditional ACs (e.g., polycyclic aromatic hydrocarbons, polychlorinated biphenyls) and emerging ACs (e.g., phthalate acid esters, aryl organophosphate esters). Four ML algorithms were trained on a unified RCF dataset of 878 data points, covering 6 features of soil-plant cultivation systems and 98 molecular descriptors of 55 chemicals, 29 of which are emerging ACs. The gradient-boosted regression tree (GBRT) model showed strong predictive performance under five-fold cross-validation, with a coefficient of determination (R²) of 0.75, a mean absolute error (MAE) of 0.11, and a root mean square error (RMSE) of 0.22. Multiple explanatory analyses highlighted the importance of soil organic matter (SOM), plant protein and lipid content, exposure time, and molecular descriptors related to the electronegativity distribution pattern (GATS8e) and double-ring structure (fr_bicyclic). An increase in SOM was found to decrease the overall RCF, while the other variables showed strong correlations with RCF only within specific ranges. The GBRT model provides an important tool for assessing the environmental behavior of ACs in soil-plant systems, thereby supporting further investigation of their ecological and human exposure risks.
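The evaluation described in the abstract (a GBRT regressor scored by five-fold cross-validation on R², MAE, and RMSE) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn's `GradientBoostingRegressor` with default hyperparameters, and it uses synthetic random data shaped like the dataset (878 rows, 6 + 98 = 104 features) because the actual RCF data are not reproduced here. The helper name `evaluate_gbrt` and the synthetic target are hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold, cross_validate


def evaluate_gbrt(X, y, n_splits=5, seed=0):
    """Return mean R², MAE, and RMSE of a GBRT model over k-fold CV."""
    # Default hyperparameters: an assumption, not the tuned model from the paper.
    model = GradientBoostingRegressor(random_state=seed)
    cv = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = cross_validate(
        model,
        X,
        y,
        cv=cv,
        scoring={
            "r2": "r2",
            "mae": "neg_mean_absolute_error",
            "rmse": "neg_root_mean_squared_error",
        },
    )
    # scikit-learn reports errors as negated scores, so flip the sign back.
    return {
        "r2": scores["test_r2"].mean(),
        "mae": -scores["test_mae"].mean(),
        "rmse": -scores["test_rmse"].mean(),
    }


# Synthetic stand-in data: 878 samples, 6 system features + 98 descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(878, 104))
# Hypothetical target that depends on two features plus noise.
y = 0.8 * X[:, 0] - 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.2, size=878)

metrics = evaluate_gbrt(X, y)
print(metrics)
```

The numbers printed here describe the synthetic data only and are not comparable to the reported R² of 0.75, MAE of 0.11, and RMSE of 0.22. The sketch splits folds at random; whether the study split by data point or by chemical is not stated in the abstract.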

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d109/11511036/f7b0e42cd8b1/toxics-12-00737-g001.jpg
