Taraji Maryam, Haddad Paul R, Amos Ruth I J, Talebi Mohammad, Szucs Roman, Dolan John W, Pohl Christopher A
Australian Centre for Research on Separation Science (ACROSS), School of Physical Sciences-Chemistry, University of Tasmania, Private Bag 75, Hobart 7001, Australia.
J Chromatogr A. 2017 Feb 24;1486:59-67. doi: 10.1016/j.chroma.2016.12.025. Epub 2016 Dec 14.
Quantitative structure-retention relationship (QSRR) models were developed to predict the retention times of analytes on five hydrophilic interaction liquid chromatography (HILIC) stationary phases (bare silica, amine, amide, diol and zwitterionic), with a view to selecting the most suitable stationary phase(s) for separating these analytes. The study used six β-adrenergic agonists as target analytes. Molecular descriptors were calculated solely from chemical structures optimized using density functional theory. A genetic algorithm (GA) was then used to select the most relevant molecular descriptors, and these were used to build a retention model for each stationary phase by partial least squares (PLS) regression. Each model was then used to predict the retention of the test set of target analytes. This process created an optimized descriptor set, which enhanced the reliability of the developed QSRR models. Finally, the QSRR models were used to provide some insight into the separation mechanisms operating in the HILIC mode. Three performance criteria (mean absolute error (MAE), root mean square error of prediction scaled to retention time (RMSEP), and the number of selected descriptors) were used to evaluate the models when applied to an external test set of six β-adrenergic agonists, and the models showed high predictive ability. MAE values ranged from 13 to 25 s on four of the stationary phases, with a somewhat higher error (50 s) observed for the zwitterionic phase. RMSEP values of 4.88-11.12% were recorded. Validation by Y-randomization and by analysis of the chemical applicability domain indicated that the optimized GA-PLS models were robust.
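The Y-randomization check named in the validation step can be illustrated with a minimal sketch. The idea is to refit the model many times against shuffled retention times: if the shuffled fits score almost as well as the real one, the original correlation is probably due to chance. The sketch below is not the authors' GA-PLS workflow. It assumes a one-descriptor ordinary least-squares fit as a stand-in for PLS, and the function names (`r_squared`, `y_randomization`) and all data values are hypothetical.

```python
import random


def r_squared(x, y):
    """R^2 of a one-descriptor least-squares fit (stand-in for a PLS model)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # For simple linear regression, R^2 equals the squared Pearson correlation.
    return (sxy * sxy) / (sxx * syy)


def y_randomization(x, y, n_rounds=200, seed=0):
    """Return the R^2 values obtained after shuffling y n_rounds times."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    scores = []
    for _ in range(n_rounds):
        shuffled = list(y)
        rng.shuffle(shuffled)  # break the descriptor-retention pairing
        scores.append(r_squared(x, shuffled))
    return scores


# Hypothetical descriptor values and retention times in seconds
# (illustrative only, not data from the study).
descriptor = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
retention = [62.0, 118.0, 171.0, 236.0, 284.0, 352.0, 401.0, 467.0, 512.0, 580.0]

true_r2 = r_squared(descriptor, retention)
random_r2 = y_randomization(descriptor, retention)
mean_random_r2 = sum(random_r2) / len(random_r2)

print(f"R^2 with true retention times:     {true_r2:.3f}")
print(f"mean R^2 after Y-randomization:    {mean_random_r2:.3f}")
```

A large gap between the two printed values is what a robust model is expected to show. With a real multi-descriptor model, the descriptor selection step would also need to be repeated inside each shuffled round.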
The high accuracy, reliability and applicability of the models were due largely to the optimization of the GA descriptor set and to the inclusion of relevant structural and geometric molecular descriptors, together with descriptors based on important physicochemical properties, which establish a strong connection between retention time and meaningful chemical properties. Although it is a pilot study, the present strategy holds great promise for broader screening of HILIC stationary phases for a desired separation, as well as for gaining information about the molecular mechanisms of separation under chromatographic conditions.
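The two error metrics quoted in the abstract can be made concrete with a short sketch. The abstract does not state how RMSEP was scaled to retention time. The version below assumes the root mean square error is divided by the mean observed retention time and expressed as a percentage, which may differ from the authors' definition. The function names and the retention times are hypothetical.

```python
import math


def mean_absolute_error(observed, predicted):
    """Mean absolute difference between observed and predicted retention times."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmsep_percent(observed, predicted):
    """RMSEP as a percentage of the mean observed retention time.

    Assumption: scaling by the mean observed value; the source abstract
    does not spell out the exact normalization.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    mean_observed = sum(observed) / len(observed)
    return 100.0 * math.sqrt(mse) / mean_observed


# Hypothetical retention times in seconds (not data from the study).
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 320.0, 380.0]

mae = mean_absolute_error(observed, predicted)
rmsep = rmsep_percent(observed, predicted)
print(f"MAE = {mae:.1f} s, RMSEP = {rmsep:.2f} %")
```

MAE stays in seconds and is therefore easy to compare with peak widths, while the scaled RMSEP allows comparison across stationary phases whose retention times differ in magnitude.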