Validation subset selections for extrapolation oriented QSPAR models.

Authors

Szántai-Kis Csaba, Kövesdi István, Kéri György, Orfi László

Affiliation

Cooperative Research Center, Semmelweis University, Pf 131, Budapest 5, Hungary, 1367.

Publication

Mol Divers. 2003;7(1):37-43. doi: 10.1023/b:modi.0000006538.99122.00.

DOI: 10.1023/b:modi.0000006538.99122.00
PMID: 14768902

Abstract

One of the most important features of QSPAR models is their predictive ability, which should be checked by external validation. In this work we examined three external validation set selection methods for their usefulness in in-silico screening. The study proceeded in three steps: 1) we generated thousands of QSPR models and stored them in 'model banks'; 2) we selected a final top model from the model banks using each of the three validation set selection methods; 3) we predicted large data sets, which we called 'chemical universe sets', and calculated the corresponding standard errors of prediction (SEPs). The models were generated from small fractions of the available water solubility data during a genetic algorithm (GA) variable subset selection procedure. The external validation sets were constructed by random selection, uniformly distributed selection, or perimeter-oriented selection. We found that the models performing best on the perimeter-oriented external validation sets usually gave the best validation results when the remaining part of the available data was overwhelmingly large, i.e., when the model had to extrapolate extensively. We also compared the final top models obtained with each external validation set selection method on three independent 'chemical universe sets' of different sizes.
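
The three selection schemes and the model-bank step can be illustrated with a minimal Python sketch. The abstract does not specify the algorithms, so everything here is an assumption made for illustration: "perimeter-oriented" is taken to mean the compounds farthest from the centroid of the descriptor space, "uniformly distributed" is taken to mean compounds evenly spaced along the centroid-distance ranking, SEP is computed as a plain root-mean-square prediction error, and a model is any callable that maps a descriptor row to a predicted value. All function names are hypothetical.

```python
import math
import random


def centroid_distances(X):
    """Euclidean distance of each descriptor row from the column-wise mean."""
    n, d = len(X), len(X[0])
    center = [sum(row[j] for row in X) / n for j in range(d)]
    return [math.dist(row, center) for row in X]


def random_selection(X, k, seed=0):
    """Pick k compound indices at random, without replacement."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(len(X)), k))


def uniform_selection(X, k):
    """Pick k indices evenly spaced along the centroid-distance ranking (assumed definition)."""
    dist = centroid_distances(X)
    order = sorted(range(len(X)), key=dist.__getitem__)
    step = len(order) / k
    return sorted(order[int((i + 0.5) * step)] for i in range(k))


def perimeter_selection(X, k):
    """Pick the k indices farthest from the centroid (assumed definition)."""
    dist = centroid_distances(X)
    order = sorted(range(len(X)), key=dist.__getitem__, reverse=True)
    return sorted(order[:k])


def sep(y_true, y_pred):
    """Root-mean-square prediction error, used here as a stand-in for SEP."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pick_top_model(model_bank, X_val, y_val):
    """Return the model from the bank with the lowest SEP on the validation set."""
    return min(model_bank, key=lambda m: sep(y_val, [m(x) for x in X_val]))


# Toy descriptor matrix: five clustered compounds and one outlier.
X = [[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5], [5, 5]]
print(perimeter_selection(X, 2))
```

Under these assumptions, a perimeter-oriented validation set rewards models that hold up at the edge of the training domain, which is the situation the abstract describes when the model must extrapolate to a much larger data set.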


