Predictive Modeling of Pesticides Reproductive Toxicity in Earthworms Using Interpretable Machine-Learning Techniques on Imbalanced Data.

Authors

Kotli Mihkel, Piir Geven, Maran Uko

Affiliation

Institute of Chemistry, University of Tartu, Ravila 14a, Tartu 50411, Estonia.

Publication information

ACS Omega. 2025 Jan 30;10(5):4732-4744. doi: 10.1021/acsomega.4c09719. eCollection 2025 Feb 11.

Abstract

The earthworm is a key indicator species in soil ecosystems. The reproductive toxicity of chemical compounds to earthworms is therefore an important property to determine, and computational models are needed for descriptive and predictive purposes. The aim of this work was to develop an advanced Quantitative Structure-Activity Relationship (QSAR) modeling approach for this complex property with imbalanced data. The approach integrated gradient-boosted decision trees as classifiers with a genetic algorithm for feature selection and Bayesian optimization for hyperparameter tuning. An additional goal was to use SHAP values to analyze and interpret the structural features, encoded by the molecular descriptors, that contribute to pesticide toxicity and nontoxicity; the most notable are solvation entropy and the number of hydrolyzable bonds. The final model was constructed as a stacked ensemble that combined the strengths of the individual models. Evaluation on an external test set of 147 compounds demonstrated a well-defined applicability domain and sufficient predictive capability, with a balanced accuracy of 77%. The model representation follows FAIR principles and is available on QsarDB.org.
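The abstract reports balanced accuracy rather than plain accuracy, which is the usual choice when one class is much rarer than the other. As a minimal sketch of why, the Python below computes balanced accuracy as the mean of sensitivity and specificity for a binary toxic/nontoxic label. It is not the authors' code: the function names and the toy labels (10 toxic, 90 nontoxic) are hypothetical and chosen only for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels.

    Labels are assumed to be 1 (toxic) and 0 (nontoxic).
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # recall on the toxic class
    specificity = tn / (tn + fp)  # recall on the nontoxic class
    return (sensitivity + specificity) / 2


# Hypothetical imbalanced toy data: 10 toxic, 90 nontoxic compounds.
y_true = [1] * 10 + [0] * 90

# A trivial classifier that labels every compound as nontoxic.
y_majority = [0] * 100

# A classifier that recovers 8 of 10 toxic and 72 of 90 nontoxic compounds.
y_model = [1] * 8 + [0] * 2 + [0] * 72 + [1] * 18

print(accuracy(y_true, y_majority), balanced_accuracy(y_true, y_majority))
print(accuracy(y_true, y_model), balanced_accuracy(y_true, y_model))
```

On this toy data the trivial classifier scores high on plain accuracy while its balanced accuracy stays at chance level, because it never identifies a toxic compound. Balanced accuracy weights both classes equally, so a model must perform on the minority class to score well.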

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ebc/11822515/6d8a4747ddef/ao4c09719_0001.jpg
