Xiang Lang, Qu Pengfei
School of Pharmacy, Nanjing University of Chinese Medicine, Nanjing 210023, China.
School of Management Science and Engineering, Shandong Technology and Business University, Yantai 264005, China.
ACS Omega. 2025 Aug 8;10(32):35512-35527. doi: 10.1021/acsomega.4c11725. eCollection 2025 Aug 19.
This study presents a comprehensive application of integrated machine learning tools for modeling and optimizing the ibuprofen synthesis process. Initially, a database of 39,460 input combinations is created using chemical reaction theory and validated with experimental data. The CatBoost meta-model, optimized by the snow ablation optimizer, outperforms conventional algorithms in predicting reaction time, conversion rate, and production cost. Importance analyses through SHAP values identify critical input variables, notably the concentrations of the catalyst precursor (L₂PdCl₂), hydrogen ions (H⁺), and water (H₂O), validating known catalytic principles and providing quantitative parameter guidance through data-driven analysis. Multiobjective optimization using NSGA-II generates a Pareto front of solutions, from which four industrial strategies are derived: balanced performance, maximum output, maximum yield, and minimum cost, each suitable for different production scenarios. The results identify optimal catalyst concentration ranges (0.002-0.01 mol/m³) that achieve high conversion rates while maintaining low costs. Uncertainty analysis conducted through Monte Carlo simulation reveals that reaction time exhibits particularly high sensitivity to parameter fluctuations, with a distinctive nonlinear response peaking at moderate perturbation levels (σ = 0.3). This study provides valuable insights for the rational design of ibuprofen synthesis conditions and demonstrates the effectiveness of integrating physics-based modeling with machine learning for chemical process optimization.
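To make the Pareto-front idea concrete, the sketch below shows only the non-dominated filtering step that underlies a method such as NSGA-II, followed by picking extreme solutions from the front. It is not the authors' implementation and omits the genetic operators and crowding-distance ranking that NSGA-II adds. The candidate values, the `Candidate` tuple layout, and the function names are hypothetical and chosen for illustration; they are not data from the study.

```python
from typing import List, Tuple

# Hypothetical layout: (reaction_time, conversion_rate, production_cost).
# Time and cost are minimized; conversion rate is maximized.
Candidate = Tuple[float, float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on every objective
    and strictly better on at least one."""
    no_worse = a[0] <= b[0] and a[1] >= b[1] and a[2] <= b[2]
    strictly_better = a[0] < b[0] or a[1] > b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


# Illustrative values only (assumed, not taken from the paper).
candidates = [
    (120.0, 0.95, 10.0),
    (90.0, 0.90, 12.0),
    (150.0, 0.97, 9.0),
    (130.0, 0.94, 11.0),  # dominated by the first candidate
    (90.0, 0.85, 13.0),   # dominated by the second candidate
]

front = pareto_front(candidates)

# Simple selection rules over the front, loosely analogous to the
# "minimum cost" and "maximum yield" strategies named in the abstract.
min_cost = min(front, key=lambda c: c[2])
max_yield = max(front, key=lambda c: c[1])
fastest = min(front, key=lambda c: c[0])
```

With these assumed numbers, the minimum-cost and maximum-yield picks happen to coincide, which illustrates why a separate "balanced" strategy can still be useful when the objectives trade off against reaction time.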