In Silico High-Performance Liquid Chromatography Method Development via Machine Learning.

Author information

Marchetto Alberto, Tirapelle Monica, Mazzei Luca, Sorensen Eva, Besenhard Maximilian O

Affiliations

Department of Chemical Engineering, University College London, Torrington Place, London WC1E 7JE, U.K.

Department of Management, Economics and Industrial Engineering, Politecnico di Milano, Via Raffaele Lambruschini 4/B, Milano 20156, Italy.

Publication information

Anal Chem. 2025 Apr 8;97(13):6991-7001. doi: 10.1021/acs.analchem.4c03466. Epub 2025 Mar 28.

Abstract

High-performance liquid chromatography (HPLC) remains the gold standard for analyzing and purifying molecular components in solutions. However, developing HPLC methods is material- and time-consuming, so computer-aided shortcuts are highly desirable. In line with the digitalization of process development and the growth of HPLC databases, we propose a data-driven methodology to predict molecule retention factors as a function of mobile phase composition without the need for any new experiments, relying solely on molecular descriptors (MDs) obtained via simplified molecular input line entry system (SMILES) string representations of molecules. This new approach combines: (a) quantitative structure-property relationships (QSPR) using MDs to predict solute-dependent parameters in (b) linear solvation energy relationships (LSER) and (c) linear solvent strength (LSS) theory. We demonstrate the potential of this computational methodology using experimental data for retention factors of small molecules made available by the research community, for which the MDs were obtained via SMILES string representations determined by the structural formulas of the molecules. This method can be adopted directly to predict elution times of molecular components; however, in combination with first-principles-based mechanistic transport models, the method can also be employed to optimize HPLC methods in silico. Both options can reduce the experimental load and accelerate HPLC method development significantly, lowering the time and cost of the drug manufacturing cycle and reducing the time to market. Given the growing number and quality of HPLC databases, the predictive power of this methodology will only increase in the coming years.
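
To make the phrase "retention factors as a function of mobile phase composition" concrete, the sketch below evaluates the standard LSS relation, log10 k = log10 k_w - S·φ, and converts the result to an isocratic retention time with t_R = t_0(1 + k). It is only an illustration of the LSS building block, not the authors' implementation: the function names and all numerical values (log k_w, S, t_0) are hypothetical placeholders. In the approach described in the abstract, solute-dependent parameters would be predicted from molecular descriptors by QSPR models rather than typed in by hand.

```python
def retention_factor(log_kw: float, s: float, phi: float) -> float:
    """LSS relation: log10(k) = log10(k_w) - S * phi.

    log_kw : log10 of the retention factor extrapolated to pure water
    s      : solvent-strength parameter of the solute
    phi    : volume fraction of organic modifier in the mobile phase (0..1)
    """
    if not 0.0 <= phi <= 1.0:
        raise ValueError("phi must be a volume fraction between 0 and 1")
    return 10.0 ** (log_kw - s * phi)


def retention_time(k: float, t0: float) -> float:
    """Isocratic retention time from the retention factor: t_R = t0 * (1 + k)."""
    return t0 * (1.0 + k)


if __name__ == "__main__":
    # Hypothetical solute parameters and column dead time (assumed values,
    # chosen only for illustration; not taken from the paper).
    log_kw, s, t0_min = 2.0, 4.0, 1.5

    for phi in (0.3, 0.4, 0.5):
        k = retention_factor(log_kw, s, phi)
        t_r = retention_time(k, t0_min)
        print(f"phi={phi:.1f}  k={k:.3f}  t_R={t_r:.2f} min")
```

Because k falls exponentially with φ under this relation, small changes in modifier fraction shift elution times substantially, which is why a predictive model of the solute parameters can replace many scouting runs.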

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3ec/11983366/4ef52e889cb3/ac4c03466_0001.jpg
