Park Soo Hyun, Talebi Mohammad, Amos Ruth I J, Tyteca Eva, Haddad Paul R, Szucs Roman, Pohl Christopher A, Dolan John W
Australian Centre for Research on Separation Science (ACROSS), School of Physical Sciences-Chemistry, University of Tasmania, Private Bag 75, Hobart, 7001, Australia.
J Chromatogr A. 2017 Nov 10;1523:173-182. doi: 10.1016/j.chroma.2017.02.054. Epub 2017 Feb 24.
Quantitative Structure-Retention Relationships (QSRR) are used to predict the retention times of compounds solely from their chemical structures, encoded as molecular descriptors. The main concern in QSRR modelling is to build models with high predictive power, so that retention can be predicted reliably for unknown compounds across the chromatographic space. To enhance the predictive power of the models, this work extends our previously proposed QSRR modelling approach, called "federation of local models", to ion chromatography in order to predict the retention times of unknown ions. In this approach, a local model is created for each target (unknown) ion using only structurally similar ions from the dataset. A Tanimoto similarity (TS) score was used as the measure of structural similarity, and each training set was built from the ions whose similarity to the target ion met a threshold value. Predicting the retention parameters (a- and b-values) of the linear solvent strength (LSS) model in ion chromatography, log k = a - b log[eluent], allows retention times to be predicted at any eluent concentration. The QSRR models for the a- and b-values were developed by a genetic algorithm-partial least squares method using the retention data of inorganic and small organic anions and larger organic cations (molecular mass up to 507) on four Thermo Fisher Scientific columns (AS20, AS19, AS11HC and CS17). The corresponding predicted retention times were calculated by substituting the predicted a- and b-values into the LSS model equation. The predicted retention times were also plotted against the experimental values to evaluate the goodness of fit and the predictive power of the models. A TS threshold of 0.6 was found to produce predictive and reliable QSRR models (Q > 0.8 and Mean Absolute Error < 0.1), and hence accurate retention time predictions with an average Mean Absolute Error of 0.2 min.
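The two computational steps described in the abstract, selecting a similarity-filtered training set and converting predicted a- and b-values into a retention time, can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made for illustration only: fingerprints are represented as sets of on-bit indices, an ion is included when its TS score is greater than or equal to the threshold, the function names and example ions are hypothetical, and the retention time is obtained from the retention factor through the standard isocratic relation t_R = t_0(1 + k), where the void time t_0 is a quantity the abstract does not mention. The genetic algorithm-partial least squares step that predicts a and b is not shown.

```python
import math


def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / union


def select_training_set(target_fp: set, dataset: dict, threshold: float = 0.6) -> list:
    """Names of dataset ions whose TS score to the target meets the threshold.

    Assumption: inclusion uses >= threshold; the abstract only says
    similarity is "defined by a threshold value".
    """
    return [name for name, fp in dataset.items()
            if tanimoto(target_fp, fp) >= threshold]


def retention_time(a: float, b: float, eluent_conc: float, t0: float) -> float:
    """Retention time from the LSS model log k = a - b log[eluent].

    Assumption: isocratic elution, so t_R = t0 * (1 + k), with t0 the
    column void time (not given in the abstract).
    """
    log_k = a - b * math.log10(eluent_conc)
    return t0 * (1.0 + 10.0 ** log_k)


# Hypothetical fingerprints, for illustration only.
target = {1, 2, 3, 4}
library = {
    "ion_A": {1, 2, 3, 4, 5},  # TS = 4/5
    "ion_B": {1, 2, 7, 8},     # TS = 2/6
    "ion_C": {2, 3, 4},        # TS = 3/4
}
local_training_set = select_training_set(target, library, threshold=0.6)

# Hypothetical predicted LSS parameters and void time.
t_r = retention_time(a=1.0, b=1.0, eluent_conc=10.0, t0=2.0)
```

With these made-up inputs, only the ions at or above the 0.6 threshold enter the local training set, and the retention time follows directly from a, b and the eluent concentration. In practice the a- and b-values would come from the local QSRR model fitted on that training set.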