Kubik Łukasz, Kaliszan Roman, Wiczling Paweł
Department of Biopharmaceutics and Pharmacodynamics, Medical University of Gdańsk, Generała Józefa Hallera 107, 80-416 Gdańsk, Poland.
Anal Chem. 2018 Nov 20;90(22):13670-13679. doi: 10.1021/acs.analchem.8b04033. Epub 2018 Oct 30.
The objective of this work was to develop a multilevel (hierarchical) model based on isocratic reversed-phase high-performance liquid chromatography data collected in methanol and acetonitrile for 58 chemical compounds. Such a multilevel model is a regression model of the analyte-specific chromatographic measurements in which all the regression parameters are given a probability model. It differs fundamentally from the most common approach, in which parameters are estimated separately for each analyte, without sharing information across analytes or organic modifiers. The statistical analysis was done with Stan software, which implements Bayesian inference with Markov chain Monte Carlo sampling. During the model-building process, a series of multilevel models of different complexity was obtained: (1) a model with no pooling (separate models were fitted for each analyte), (2) a model with partial pooling (a common distribution was used for analyte-specific parameters), and (3) a model with partial pooling as well as a regression model relating analyte-specific parameters to analyte-specific properties (quantitative structure-retention relationship, QSRR, equations). All the models were compared with each other using 10-fold cross-validation. The benefits of multilevel models for inference and prediction were shown. In particular, the obtained models allowed us to (i) better understand the data and (ii) solve many routine analytical problems, such as obtaining well-calibrated predictions of retention factors for an analyte in acetonitrile-containing mobile phases given zero, one, or several measurements in methanol-containing mobile phases, and vice versa.
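The difference between no pooling and partial pooling can be illustrated with a minimal Python sketch. It is not the authors' Stan model: it assumes a much simpler normal-normal setting in which each analyte has one mean log retention factor, the within-analyte and between-analyte standard deviations (`sigma_within`, `tau_between`) are fixed by hand, and the pooled mean of all observations stands in for the population mean. The analyte names and `log_k` values are hypothetical.

```python
from statistics import mean


def no_pooling(groups):
    """Estimate each analyte's mean separately; no information is shared."""
    return {name: mean(values) for name, values in groups.items()}


def partial_pooling(groups, sigma_within, tau_between):
    """Shrink each analyte's mean toward the pooled mean (normal-normal model).

    sigma_within: assumed measurement SD within one analyte (hypothetical value)
    tau_between:  assumed SD of true analyte means across analytes (hypothetical value)
    """
    # Simplification: the pooled mean of all observations is plugged in
    # for the population mean instead of being estimated jointly.
    pooled_mean = mean(v for values in groups.values() for v in values)
    estimates = {}
    for name, values in groups.items():
        n = len(values)
        data_precision = n / sigma_within**2
        prior_precision = 1 / tau_between**2
        # Precision weight: more measurements means less shrinkage.
        weight = data_precision / (data_precision + prior_precision)
        estimates[name] = weight * mean(values) + (1 - weight) * pooled_mean
    return estimates


# Hypothetical log retention factors for three analytes.
log_k = {
    "A": [1.0, 1.2, 1.1, 0.9],
    "B": [2.5],
    "C": [0.2, 0.4],
}

separate = no_pooling(log_k)
pooled = partial_pooling(log_k, sigma_within=0.3, tau_between=0.5)

for name in log_k:
    print(f"{name}: no pooling = {separate[name]:.3f}, "
          f"partial pooling = {pooled[name]:.3f}")
```

In this sketch the analyte with a single measurement ("B") is pulled toward the pooled mean more strongly than the analyte with four measurements ("A"), which is the qualitative behavior that lets a multilevel model make predictions for analytes with few or no measurements. A full treatment would estimate the variance components and population mean jointly, as a Stan model does.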