Wiczling Paweł, Kamedulska Agnieszka, Kubik Łukasz
Department of Biopharmaceutics and Pharmacodynamics, Medical University of Gdańsk, Gen. J. Hallera 107, 80-416 Gdańsk, Poland.
Anal Chem. 2021 May 11;93(18):6961-6971. doi: 10.1021/acs.analchem.0c05227. Epub 2021 Apr 27.
Quantitative structure-retention relationships (QSRRs) are used in chromatography to model the relationship between analyte structure and chromatographic retention. Such models are typically difficult to build and validate for heterogeneous compounds because they involve many descriptors and relatively few analyte-specific data. In this study, a Bayesian multilevel model is proposed to characterize isocratic retention time data collected for 1026 heterogeneous analytes. The QSRR considers the effects of molecular mass and 100 functional groups (substituents) on the analyte-specific chromatographic parameters of the Neue model (i.e., the retention factor in water, the retention factor in acetonitrile, and the curvature coefficient). The multilevel structure was used to stabilize noisy parameter estimates for analytes with too few data and to account for uncertainty in the model parameters. We discuss the benefits of the Bayesian multilevel model for (i) understanding chromatographic data, (ii) quantifying the effect of functional groups on chromatographic retention, and (iii) predicting analyte retention from various types of preliminary data. The uncertainty of isocratic and gradient predictions was visualized using uncertainty chromatograms and discussed in terms of its usefulness in decision making. We expect the method to be most useful as a unified scheme for analyzing large chromatographic databases and for assessing the impact of functional groups and other descriptors on analyte retention.
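As an illustration of the three analyte-specific parameters named above, the following is a minimal sketch of a Neue-type retention function. The abstract does not state the equation, so the functional form used here, log k = log kw - S1(1 + S2)φ / (1 + S2 φ) with S1 = log kw - log ka, is an assumption chosen so that the curve passes through the retention factor in water at φ = 0 and the retention factor in acetonitrile at φ = 1. The function names, the column hold-up time t0, and all numeric values are hypothetical.

```python
def neue_log_k(phi, log_kw, log_ka, s2):
    """Log10 retention factor at organic modifier fraction phi (0 to 1).

    Assumed parameterization (not given in the abstract):
      log_kw : log10 retention factor in pure water (phi = 0)
      log_ka : log10 retention factor in pure acetonitrile (phi = 1)
      s2     : curvature coefficient (s2 = 0 gives a straight line)
    """
    s1 = log_kw - log_ka  # total drop in log k between water and acetonitrile
    return log_kw - s1 * (1.0 + s2) * phi / (1.0 + s2 * phi)


def isocratic_retention_time(phi, log_kw, log_ka, s2, t0):
    """Isocratic retention time from the standard relation tR = t0 * (1 + k)."""
    k = 10.0 ** neue_log_k(phi, log_kw, log_ka, s2)
    return t0 * (1.0 + k)


if __name__ == "__main__":
    # Hypothetical analyte: log kw = 3, log ka = -1, curvature 2, t0 = 0.5 min.
    for phi in (0.0, 0.25, 0.5, 0.75, 1.0):
        t_r = isocratic_retention_time(phi, 3.0, -1.0, 2.0, t0=0.5)
        print(f"phi={phi:.2f}  tR={t_r:.3f} min")
```

In a multilevel QSRR of the kind described, each analyte would have its own values of these three parameters, with their population means depending on molecular mass and functional-group counts; this sketch covers only the retention function itself.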