Application of random forest approach to QSAR prediction of aquatic toxicity.

Affiliation

Laboratory on Theoretical Chemistry, A.V. Bogatsky Physical-Chemical Institute NAS of Ukraine, Odessa 65080, Ukraine.

Publication information

J Chem Inf Model. 2009 Nov;49(11):2481-8. doi: 10.1021/ci900203n.

Abstract

This work applies the random forest (RF) approach to QSAR analysis of the aquatic toxicity of chemical compounds tested on Tetrahymena pyriformis. The simplex representation of molecular structure approach, as implemented in HiT QSAR software, was used to generate descriptors at the two-dimensional level. Adequate models based on simplex descriptors and the RF statistical approach were obtained on a modeling set of 644 compounds. Model predictivity was validated on two external test sets of 339 and 110 compounds. Lipophilicity and polarizability of the investigated compounds were found to have a high impact on toxicity. The RF models were shown to be tolerant of the insertion of irrelevant descriptors and of the randomization of part of the toxicity values, which represented "noise". A fast procedure for optimizing the number of trees in the random forest is proposed. The discussed RF model had comparable or better statistical characteristics than the corresponding PLS or KNN models.
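The abstract does not describe how the number of trees was optimized. As a purely illustrative sketch, and not the authors' procedure, one generic approach is to grow a large forest once, average the predictions of the first k trees for each k, and keep the smallest k whose error is close to the best error observed. The function names, the `tolerance` parameter, and the use of a held-out validation set are all assumptions made for this example.

```python
import math


def rmse(pred, truth):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))


def pick_tree_count(per_tree_preds, truth, tolerance=0.01):
    """Choose a forest size from per-tree predictions (hypothetical helper).

    per_tree_preds[i][j] is the prediction of tree i for validation sample j.
    Returns (n_trees, errors), where errors[k - 1] is the RMSE of the averaged
    prediction of the first k trees, and n_trees is the smallest k whose RMSE
    is within a relative `tolerance` of the lowest RMSE seen.
    """
    running = [0.0] * len(truth)  # running sum of predictions per sample
    errors = []
    for k, preds in enumerate(per_tree_preds, start=1):
        running = [r + p for r, p in zip(running, preds)]
        errors.append(rmse([r / k for r in running], truth))
    best = min(errors)
    for k, err in enumerate(errors, start=1):
        if err <= best * (1 + tolerance):
            return k, errors
```

Because every tree is trained only once and the ensemble error is updated incrementally, the scan over candidate forest sizes costs little more than training the largest forest. With a real RF library, the per-tree predictions would come from the fitted trees rather than from hand-built lists.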
