Stacked Generalization with Applicability Domain Outperforms Simple QSAR on in Vitro Toxicological Data.

Affiliations

University Côte d'Azur, I3S Laboratory, UMR CNRS 7271, CS 40121, 06903 Sophia Antipolis Cedex, France.

Bayer SAS, 06903 Sophia Antipolis Cedex, France.

Publication information

J Chem Inf Model. 2019 Apr 22;59(4):1486-1496. doi: 10.1021/acs.jcim.8b00553. Epub 2019 Feb 27.

DOI:10.1021/acs.jcim.8b00553
PMID:30735402
Abstract

The development of in silico tools able to predict bioactivity and toxicity of chemical substances is a powerful solution envisioned to assess toxicity as early as possible. To enable the development of such tools, the ToxCast program has generated and made publicly available in vitro bioactivity data for thousands of compounds. The goal of the present study is to characterize and explore the data from ToxCast in terms of Machine Learning capability. For this, a large scale analysis on the entire database has been performed to build models to predict bioactivities measured in in vitro assays. Simple classical QSAR algorithms (ANN, SVM, LDA, random forest, and Bayesian) were first applied on the data, and the results of these algorithms suggested that they do not seem to be well-suited for data sets with a high proportion of inactive compounds. The study then showed for the first time that the use of an ensemble method named "Stacked generalization" could improve the model performance on this type of data. Indeed, for 61% of 483 models, the Stacked method led to models with higher performance. Moreover, the combination of this ensemble method with an applicability domain filter allows one to assess the reliability of the predictions for further compound prioritization. In particular we showed that for 50% of the models, the ROC score is better if we do not consider the compounds that are not within the applicability domain.
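The workflow the abstract describes (base QSAR classifiers, a stacked meta-learner, then an applicability domain filter) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the synthetic data set, the scikit-learn estimators and their hyperparameters, the logistic-regression meta-learner, and the k-nearest-neighbour distance rule with a 95th-percentile cutoff for the applicability domain are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import NearestNeighbors
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: synthetic stand-in for one assay, about 90% inactive (0) and 10% active (1).
X, y = make_classification(n_samples=600, n_features=20, n_informative=8,
                           weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Scale descriptors using training statistics only.
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# Level 0: one learner per algorithm family named in the abstract (settings are assumptions).
base_learners = [
    ("ann", MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0)),
    ("svm", SVC(probability=True, random_state=0)),
    ("lda", LinearDiscriminantAnalysis()),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("nb", GaussianNB()),
]

# Level 1: a meta-learner trained on cross-validated predictions of the base learners.
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression(max_iter=1000),
                           cv=3)
stack.fit(X_train, y_train)
scores = stack.predict_proba(X_test)[:, 1]

# Applicability domain (assumed rule): mean distance to the k nearest training compounds
# must not exceed the 95th percentile of the same quantity within the training set.
k = 5
nn = NearestNeighbors(n_neighbors=k + 1).fit(X_train)
train_dist = nn.kneighbors(X_train)[0][:, 1:].mean(axis=1)  # drop column 0 (self-match)
threshold = np.percentile(train_dist, 95)
test_dist = nn.kneighbors(X_test, n_neighbors=k)[0].mean(axis=1)
in_domain = test_dist <= threshold

# ROC AUC on all test compounds and on in-domain compounds only.
auc_all = roc_auc_score(y_test, scores)
if len(np.unique(y_test[in_domain])) == 2:
    auc_in = roc_auc_score(y_test[in_domain], scores[in_domain])
else:
    auc_in = float("nan")  # AUC is undefined when only one class remains

print(f"AUC (all test compounds): {auc_all:.3f}")
print(f"AUC (in-domain only, n={in_domain.sum()}): {auc_in:.3f}")
```

Comparing `auc_all` with `auc_in` mirrors the comparison reported in the abstract, although on synthetic data the filtered score is not guaranteed to be higher.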

Similar articles

1. Stacked Generalization with Applicability Domain Outperforms Simple QSAR on in Vitro Toxicological Data.
   J Chem Inf Model. 2019 Apr 22;59(4):1486-1496. doi: 10.1021/acs.jcim.8b00553. Epub 2019 Feb 27.
2. Predicting hepatotoxicity using ToxCast in vitro bioactivity and chemical structure.
   Chem Res Toxicol. 2015 Apr 20;28(4):738-51. doi: 10.1021/tx500501h. Epub 2015 Mar 9.
3. In Silico Study of In Vitro GPCR Assays by QSAR Modeling.
   Methods Mol Biol. 2016;1425:361-81. doi: 10.1007/978-1-4939-3609-0_16.
4. Critically Assessing the Predictive Power of QSAR Models for Human Liver Microsomal Stability.
   J Chem Inf Model. 2015 Aug 24;55(8):1566-75. doi: 10.1021/acs.jcim.5b00255. Epub 2015 Jul 29.
5. Targeting HIV/HCV Coinfection Using a Machine Learning-Based Multiple Quantitative Structure-Activity Relationships (Multiple QSAR) Method.
   Int J Mol Sci. 2019 Jul 22;20(14):3572. doi: 10.3390/ijms20143572.
6. Binary classification of a large collection of environmental chemicals from estrogen receptor assays by quantitative structure-activity relationship and machine learning methods.
   J Chem Inf Model. 2013 Dec 23;53(12):3244-61. doi: 10.1021/ci400527b. Epub 2013 Dec 11.
7. ADMET Evaluation in Drug Discovery. Part 17: Development of Quantitative and Qualitative Prediction Models for Chemical-Induced Respiratory Toxicity.
   Mol Pharm. 2017 Jul 3;14(7):2407-2421. doi: 10.1021/acs.molpharmaceut.7b00317. Epub 2017 Jun 21.
8. Estimating the domain of applicability for machine learning QSAR models: a study on aqueous solubility of drug discovery molecules.
   J Comput Aided Mol Des. 2007 Sep;21(9):485-98. doi: 10.1007/s10822-007-9125-z. Epub 2007 Jul 14.
9. Molecular Similarity-Based Domain Applicability Metric Efficiently Identifies Out-of-Domain Compounds.
   J Chem Inf Model. 2019 Jan 28;59(1):181-189. doi: 10.1021/acs.jcim.8b00597. Epub 2018 Nov 19.
10. General Approach to Estimate Error Bars for Quantitative Structure-Activity Relationship Predictions of Molecular Activity.
    J Chem Inf Model. 2018 Aug 27;58(8):1561-1575. doi: 10.1021/acs.jcim.8b00114. Epub 2018 Jul 17.

Cited by

1. A Novel Knowledge Fusion Ensemble for Diagnostic Differentiation of Pediatric Pneumonia and Acute Bronchitis.
   Diagnostics (Basel). 2025 Sep 6;15(17):2258. doi: 10.3390/diagnostics15172258.
2. Predictive Modeling of Pesticides Reproductive Toxicity in Earthworms Using Interpretable Machine-Learning Techniques on Imbalanced Data.
   ACS Omega. 2025 Jan 30;10(5):4732-4744. doi: 10.1021/acsomega.4c09719. eCollection 2025 Feb 11.
3. MolToxPred: small molecule toxicity prediction using machine learning approach.
   RSC Adv. 2024 Jan 30;14(6):4201-4220. doi: 10.1039/d3ra07322j. eCollection 2024 Jan 23.
4. StackBRAF: A Large-Scale Stacking Ensemble Learning for BRAF Affinity Prediction.
   ACS Omega. 2023 Jun 1;8(23):20881-20891. doi: 10.1021/acsomega.3c01641. eCollection 2023 Jun 13.
5. Recent Advances in In Silico Target Fishing.
   Molecules. 2021 Aug 24;26(17):5124. doi: 10.3390/molecules26175124.
6. Integration of machine learning and genome-scale metabolic modeling identifies multi-omics biomarkers for radiation resistance.
   Nat Commun. 2021 May 11;12(1):2700. doi: 10.1038/s41467-021-22989-1.
7. QSAR Models for Active Substances against Using Disk-Diffusion Test Data.
   Molecules. 2021 Mar 19;26(6):1734. doi: 10.3390/molecules26061734.
8. Surface-Related Features Responsible for Cytotoxic Behavior of MXenes Layered Materials Predicted with Machine Learning Approach.
   Materials (Basel). 2020 Jul 10;13(14):3083. doi: 10.3390/ma13143083.
9. STarFish: A Stacked Ensemble Target Fishing Approach and its Application to Natural Products.
   J Chem Inf Model. 2019 Nov 25;59(11):4906-4920. doi: 10.1021/acs.jcim.9b00489. Epub 2019 Oct 24.
10. Artificial Intelligence for Drug Toxicity and Safety.
    Trends Pharmacol Sci. 2019 Sep;40(9):624-635. doi: 10.1016/j.tips.2019.07.005. Epub 2019 Aug 2.