Benchmarking of linear and nonlinear approaches for quantitative structure-property relationship studies of metal complexation with ionophores.

Authors

Tetko Igor V, Solov'ev Vitaly P, Antonov Alexey V, Yao Xiaojun, Doucet Jean Pierre, Fan Botao, Hoonakker Frank, Fourches Denis, Jost Piere, Lachiche Nicolas, Varnek Alexandre

Affiliation

Institute of Bioorganic & Petrochemistry, Kiev, Ukraine.

Publication information

J Chem Inf Model. 2006 Mar-Apr;46(2):808-19. doi: 10.1021/ci0504216.

DOI:10.1021/ci0504216

PMID:16563012

Abstract

A benchmark of several popular methods, Associative Neural Networks (ANN), Support Vector Machines (SVM), k Nearest Neighbors (kNN), Maximal Margin Linear Programming (MMLP), Radial Basis Function Neural Network (RBFNN), and Multiple Linear Regression (MLR), is reported for quantitative structure-property relationships (QSPR) of the stability constants logK1 for 1:1 (M:L) complexes and logβ2 for 1:2 complexes of the metal cations Ag+ and Eu3+ with diverse sets of organic molecules in water at 298 K and an ionic strength of 0.1 M. The methods were tested on three types of descriptors: molecular descriptors including E-state values, counts of atoms determined for E-state atom types, and substructural molecular fragments (SMF). The models were compared using a 5-fold external cross-validation procedure. Robust statistical tests (bootstrap and Kolmogorov-Smirnov statistics) were employed to evaluate the significance of the calculated models. The Wilcoxon signed-rank test was used to compare the performance of the methods. Individual structure-complexation property models obtained with nonlinear methods performed significantly better than the models built using multilinear regression analysis (MLRA). However, averaging several MLRA models based on SMF descriptors gave predictions as good as those of the most efficient nonlinear techniques. Support Vector Machines and Associative Neural Networks contributed to the largest number of significant models. Models based on fragments (SMF descriptors and E-state counts) had higher predictive ability than those based on E-state indices. SMF descriptors and E-state counts gave similar results, whereas E-state indices led to less significant models. The current study illustrates the difficulties of quantitatively comparing different methods: conclusions based on only one data set, without appropriate statistical tests, could be wrong.
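The two comparison steps named in the abstract, a 5-fold external cross-validation split and a Wilcoxon signed-rank test on paired results, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `five_fold_indices` and `wilcoxon_signed_rank` are hypothetical, the pairing unit (errors of two methods on the same folds or data sets) is an assumption, and the exact p-value by enumeration of sign patterns is a choice made here for small samples.

```python
import random
from itertools import product


def five_fold_indices(n_samples, n_folds=5, seed=0):
    """Shuffle sample indices and deal them into n_folds disjoint test folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every n_folds-th shuffled index, so fold sizes differ by at most 1.
    return [indices[i::n_folds] for i in range(n_folds)]


def wilcoxon_signed_rank(errors_a, errors_b):
    """Return (W_plus, W_minus, exact two-sided p) for paired values.

    Zero differences are dropped and tied absolute differences share
    their average rank. The p-value enumerates all 2**n sign patterns,
    so this sketch is only practical for small n (assumption).
    """
    diffs = [a - b for a, b in zip(errors_a, errors_b) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next absolute difference is tied.
        while (end + 1 < len(order)
               and abs(diffs[order[end + 1]]) == abs(diffs[order[pos]])):
            end += 1
        average_rank = (pos + end) / 2 + 1  # ranks are 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1

    w_plus = sum((r for r, d in zip(ranks, diffs) if d > 0), 0.0)
    w_minus = sum((r for r, d in zip(ranks, diffs) if d < 0), 0.0)
    w_observed = min(w_plus, w_minus)
    rank_total = sum(ranks)

    # Under the null hypothesis every sign pattern is equally likely.
    as_extreme = 0
    n_patterns = 0
    for signs in product((0, 1), repeat=len(ranks)):
        plus = sum(r for r, s in zip(ranks, signs) if s)
        n_patterns += 1
        if min(plus, rank_total - plus) <= w_observed:
            as_extreme += 1
    return w_plus, w_minus, as_extreme / n_patterns
```

In use, each test fold from `five_fold_indices` would be held out in turn while a model is fitted on the remaining folds, and the paired errors of two methods would then be passed to `wilcoxon_signed_rank`. Any numbers fed to it for illustration are placeholders, not results from the paper.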

Similar articles

Benchmarking of linear and nonlinear approaches for quantitative structure-property relationship studies of metal complexation with ionophores.

J Chem Inf Model. 2006 Mar-Apr;46(2):808-19. doi: 10.1021/ci0504216.

Exhaustive QSPR studies of a large diverse set of ionic liquids: how accurately can we predict melting points?

J Chem Inf Model. 2007 May-Jun;47(3):1111-22. doi: 10.1021/ci600493x. Epub 2007 Mar 24.

Quantitative predictions of gas chromatography retention indexes with support vector machines, radial basis neural networks and multiple linear regression.

Anal Chim Acta. 2008 Feb 18;609(1):24-36. doi: 10.1016/j.aca.2008.01.003. Epub 2008 Jan 8.

QSPR study of Setschenow constants of organic compounds using MLR, ANN, and SVM analyses.

J Comput Chem. 2011 Nov 30;32(15):3241-52. doi: 10.1002/jcc.21907. Epub 2011 Aug 12.

Support vector machines-based quantitative structure-property relationship for the prediction of heat capacity.

J Chem Inf Comput Sci. 2004 Jul-Aug;44(4):1267-74. doi: 10.1021/ci049934n.

QSPR modeling of soil sorption coefficients (K(OC)) of pesticides using SPA-ANN and SPA-MLR.

J Agric Food Chem. 2009 Aug 12;57(15):7153-8. doi: 10.1021/jf9008839.

Three new consensus QSAR models for the prediction of Ames genotoxicity.

Mutagenesis. 2004 Sep;19(5):365-77. doi: 10.1093/mutage/geh043.

QSAR modeling of human serum protein binding with several modeling techniques utilizing structure-information representation.

J Med Chem. 2006 Nov 30;49(24):7169-81. doi: 10.1021/jm051245v.

In silico log P prediction for a large data set with support vector machines, radial basis neural networks and multiple linear regression.

Chem Biol Drug Des. 2009 Aug;74(2):142-7. doi: 10.1111/j.1747-0285.2009.00840.x. Epub 2009 Jun 22.

Linear indices of the "molecular pseudograph's atom adjacency matrix": definition, significance-interpretation, and application to QSAR analysis of flavone derivatives as HIV-1 integrase inhibitors.

J Chem Inf Comput Sci. 2004 Nov-Dec;44(6):2010-26. doi: 10.1021/ci049950k.

Cited by

Machine learning-based analysis of overall stability constants of metal-ligand complexes.

Sci Rep. 2022 Jul 25;12(1):11159. doi: 10.1038/s41598-022-15300-9.

The role of machine learning method in the synthesis and biological investigation of heterocyclic compounds.

Mol Divers. 2022 Jun;26(3):1875-1892. doi: 10.1007/s11030-021-10264-w. Epub 2021 Oct 20.

Applied machine learning for predicting the lanthanide-ligand binding affinities.

Sci Rep. 2020 Aug 31;10(1):14322. doi: 10.1038/s41598-020-71255-9.

Predictive cartography of metal binders using generative topographic mapping.

J Comput Aided Mol Des. 2017 Aug;31(8):701-714. doi: 10.1007/s10822-017-0033-6. Epub 2017 Jul 7.

Modeling the Biodegradability of Chemical Compounds Using the Online CHEmical Modeling Environment (OCHEM).

Mol Inform. 2014 Jan;33(1):73-85. doi: 10.1002/minf.201300030. Epub 2013 Nov 28.

The development of models to predict melting and pyrolysis point data associated with several hundred thousand compounds mined from PATENTS.

J Cheminform. 2016 Jan 22;8:2. doi: 10.1186/s13321-016-0113-y. eCollection 2016.

QSPR ensemble modelling of the 1:1 and 1:2 complexation of Co²⁺, Ni²⁺, and Cu²⁺ with organic ligands: relationships between stability constants.

J Comput Aided Mol Des. 2014 May;28(5):549-64. doi: 10.1007/s10822-014-9741-3. Epub 2014 Apr 16.

Online chemical modeling environment (OCHEM): web platform for data storage, model development and publishing of chemical information.

J Comput Aided Mol Des. 2011 Jun;25(6):533-54. doi: 10.1007/s10822-011-9440-2. Epub 2011 Jun 10.

Current mathematical methods used in QSAR/QSPR studies.

Int J Mol Sci. 2009 Apr 29;10(5):1978-1998. doi: 10.3390/ijms10051978.
