[Ensemble hologram quantitative structure activity relationship model of the chromatographic retention index of aldehydes and ketones].

Authors

Lei Bin, Zang Yunlei, Xue Zhiwei, Ge Yiqing, Li Wei, Zhai Qian, Jiao Long

Affiliations

College of Chemistry and Chemical Engineering, Xi'an Shiyou University, Xi'an 710065, China.

No. 203 Research Institute of Nuclear Industry, Xianyang 712000, China.

Publication information

Se Pu. 2021 Mar;39(3):331-337. doi: 10.3724/SP.J.1123.2020.06011.

Abstract

The chromatographic retention index (RI) is an important parameter for describing the retention behavior of substances in chromatographic analysis. Experimentally determining the RI values of different aldehydes and ketones on various polar stationary phases is expensive and time-consuming. The quantitative structure activity relationship (QSAR) approach is an important chemometric technique that has been widely used to correlate the properties of chemicals with their molecular structures. QSAR models can calculate the properties of a molecule whether or not those properties have been determined experimentally. It is therefore worthwhile to establish a QSAR model for predicting the RI values of aldehydes and ketones. Hologram QSAR (HQSAR) is a highly efficient QSAR approach that can readily generate QSAR models with good statistics and high prediction accuracy. HQSAR uses a specific kind of fragment fingerprint, known as a molecular hologram, as the structural descriptor for building the QSAR model. In QSAR research, HQSAR models are generally built as individual models. However, individual QSAR models are often affected by underfitting and overfitting. Ensemble modeling, which integrates several individual models through a consensus strategy, can overcome these shortcomings. It is therefore worth studying whether ensemble modeling can improve the prediction ability of the HQSAR method and yield more accurate and reliable QSAR models. Accordingly, this study investigates QSAR models for the chromatographic RI of aldehydes and ketones using ensemble modeling and the HQSAR method. Two individual HQSAR models were established for 34 compounds, one for each of two stationary phases, DB-210 and HP-Innowax. The prediction ability of the two models was assessed by external test set validation and leave-one-out cross validation (LOO-CV).
The 34 investigated compounds were randomly assigned to two groups. Group Ⅰ comprised 26 compounds, and Group Ⅱ comprised 8 compounds. In the external test set validation, Group Ⅰ was used to manually optimize the two fragment parameters, fragment distinction (FD) and fragment size (FS), and to build the HQSAR models. Group Ⅱ was used as the test set to assess the predictive performance of the developed models. For the DB-210 stationary phase, the optimal individual HQSAR model was obtained by setting the FD and FS to "donor/acceptor atoms (DA)" and 1-9, respectively. For the HP-Innowax stationary phase, the optimal individual HQSAR model was obtained by setting the FD and FS to "DA" and 4-7, respectively. The squared correlation coefficient of cross validation ( [Formula: see text] for predicting the RI values of the DB-210 and HP-Innowax stationary phases were 0.927 and 0.919, 0.956 and 0.979, 0.929 and 0.963, 0.927 and 0.958, and 0.935 and 0.963, respectively. Compared with the individual HQSAR models, the established ensemble HQSAR models show better robustness and accuracy, indicating that ensemble modeling is an effective approach. The combination of HQSAR and ensemble modeling is a practicable and promising method for studying and predicting the RI values of aldehydes and ketones.
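
The validation workflow described in the abstract (a random 26/8 split, LOO-CV, and a consensus over several individual models) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' method: the abstract does not specify the consensus strategy, so simple averaging is assumed; the base learners here are hypothetical one-descriptor least-squares lines rather than HQSAR models built on molecular holograms; the data are synthetic; and the q² formula (1 − PRESS/SS_tot) is one common definition that may differ from the paper's.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def ensemble_predict(train_X, train_y, row):
    """Assumed consensus strategy: average the predictions of one
    hypothetical base model per descriptor column."""
    preds = []
    for j in range(len(row)):
        a, b = fit_line([r[j] for r in train_X], train_y)
        preds.append(a + b * row[j])
    return mean(preds)

def q2_loo(X, y):
    """Leave-one-out q^2 = 1 - PRESS / SS_tot (one common definition)."""
    press = 0.0
    for i in range(len(y)):
        X_tr = X[:i] + X[i + 1:]   # drop compound i
        y_tr = y[:i] + y[i + 1:]
        press += (y[i] - ensemble_predict(X_tr, y_tr, X[i])) ** 2
    y_bar = mean(y)
    ss_tot = sum((v - y_bar) ** 2 for v in y)
    return 1.0 - press / ss_tot

def split_26_8(n=34, seed=0):
    """Randomly assign n compound indices to a 26-member training
    group and an (n - 26)-member test group."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return idx[:26], idx[26:]

if __name__ == "__main__":
    # Synthetic stand-in data: two noisy descriptors of a latent "size".
    rng = random.Random(1)
    size = [rng.uniform(1, 10) for _ in range(34)]
    X = [[s + rng.gauss(0, 0.3), s + rng.gauss(0, 0.3)] for s in size]
    y = [700 + 40 * s + rng.gauss(0, 5) for s in size]

    train, test = split_26_8()
    X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
    print("LOO-CV q2 on training group:", round(q2_loo(X_tr, y_tr), 3))
    for i in test:
        print(i, round(y[i], 1), round(ensemble_predict(X_tr, y_tr, X[i]), 1))
```

Averaging is the simplest consensus rule; weighted averages or model selection by cross-validated error would fit the same skeleton by changing only `ensemble_predict`.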

