A ternary classification using machine learning methods of distinct estrogen receptor activities within a large collection of environmental chemicals.

Author information

Beijing Advanced Innovation Center for Food Nutrition and Human Health, College of Environment, Zhejiang University of Technology, Hangzhou 310032, China.

College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China.

Publication information

Sci Total Environ. 2017 Feb 15;580:1268-1275. doi: 10.1016/j.scitotenv.2016.12.088. Epub 2016 Dec 20.

Abstract

Endocrine-disrupting chemicals (EDCs), which can threaten ecological safety and harm human health, have caused wide concern. There is high demand for efficient methods to evaluate potential EDCs in the environment. Herein, an evaluation platform was developed using novel and statistically robust ternary models built with different machine learning methods (i.e., linear discriminant analysis, classification and regression tree, and support vector machines). The platform aims to classify chemicals as having agonistic, antagonistic, or no estrogen receptor (ER) activity. A total of 440 chemicals from the literature were selected to derive and optimize the three-class model. One hundred and nine new chemicals that appeared on the 2014 EPA list for EDC screening were used to assess predictive performance by comparing E-screen results with the predictions of the classification models. The best model was obtained using support vector machines (SVM); it recognized agonists and antagonists with accuracies of 76.6% and 75.0%, respectively, on the test set (overall predictive accuracy of 75.2%) and achieved a 10-fold cross-validation (CV) accuracy of 73.4%. The external prediction accuracy validated by the E-screen assay was 87.5%, which demonstrates the model's value as a virtual alert for EDCs with ER agonistic or antagonistic activity. The ternary computational model could thus serve as a faster and less expensive method to identify EDCs that act through nuclear receptors and to classify these chemicals into different mechanism groups.
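
The abstract reports both per-class accuracies (for agonists and antagonists) and an overall accuracy for the three-class model. As a minimal sketch of how such figures relate, the Python snippet below computes both from a list of true and predicted labels. The class names, the function names, and the ten example labels are assumptions made up for illustration; they are not data or code from the study, and the authors' exact metric definitions are not given in the abstract.

```python
from collections import Counter

# Hypothetical class labels for the three ER activity groups.
CLASSES = ("agonist", "antagonist", "inactive")


def per_class_accuracy(y_true, y_pred):
    """Fraction of chemicals in each true class that were predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # Skip classes that do not occur in y_true to avoid dividing by zero.
    return {c: hits[c] / totals[c] for c in CLASSES if totals[c]}


def overall_accuracy(y_true, y_pred):
    """Fraction of all chemicals whose predicted class matches the true class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labels for illustration only; these are NOT results from the study.
y_true = ["agonist"] * 4 + ["antagonist"] * 4 + ["inactive"] * 2
y_pred = [
    "agonist", "agonist", "agonist", "inactive",
    "antagonist", "antagonist", "agonist", "antagonist",
    "inactive", "inactive",
]

by_class = per_class_accuracy(y_true, y_pred)
overall = overall_accuracy(y_true, y_pred)
print(by_class)
print(overall)
```

Because the overall figure is weighted by how many chemicals fall in each class, it need not equal the simple average of the per-class values.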

