Jo Se-Hee, Lee Jina, Won Wangyun, Kim Jun-Woo
CJ Blossom Park, CJ BIO Research Institute, 55, Gwanggyo-ro 42beon-gil, Yeongtong-gu, Suwon-Si, Gyeonggi-do 16495, Republic of Korea.
Department of Chemical and Biological Engineering, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul 02841, Republic of Korea.
ACS Omega. 2025 Jan 13;10(3):2949-2957. doi: 10.1021/acsomega.4c09246. eCollection 2025 Jan 28.
A major challenge in bioprocess simulation is the lack of physical and chemical property databases for biochemicals. A Python-based algorithm was developed to estimate nonrandom two-liquid (NRTL) model parameters for aqueous binary systems directly from the simplified molecular-input line-entry specification (SMILES) strings of the substances in the system. The algorithm proceeds in three steps: (1) fragmentation of the molecules into functional groups from their SMILES strings, (2) calculation of activity coefficients at predetermined temperatures and mole fractions using the universal quasi-chemical functional group activity coefficient (UNIFAC) model, and (3) regression of the NRTL model parameters against the UNIFAC results using the differential evolution algorithm (DEA) and the Nelder-Mead method (NMM). The algorithm was applied to aqueous binary mixtures of 37 common biochemical substances, such as amino acids, organic acids, and sugars. The resulting NRTL parameters were compared with those from Aspen Plus, a commercial software package with an equivalent NRTL parameter estimation function. The percentage mean absolute residuals of the activity coefficients obtained with DEA, NMM, and the Aspen Plus parameter estimation tool were in the ranges 0.05-16.69%, 0.05-16.69%, and 0.09-326.77%, respectively. This in-house algorithm should help obtain more accurate NRTL parameters in a timely manner and facilitate the simulation of biochemical processes for process optimization, energy consumption estimation, and life cycle assessment.
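Step (3), the regression, can be illustrated with a minimal sketch. It is not the authors' code and rests on several assumptions: the reference activity coefficients are synthetic, generated from a hypothetical "true" NRTL parameter set as a stand-in for UNIFAC output; the interaction parameters take the simple form tau_ij = b_ij / T; the non-randomness factor is fixed at 0.3; and the temperature and mole fraction grids and the parameter bounds are arbitrary. The objective is a percentage mean absolute residual of the activity coefficients, and the Nelder-Mead run is started from the differential evolution result purely for convenience, since the abstract does not say how either optimizer was initialized.

```python
import numpy as np
from scipy.optimize import differential_evolution, minimize

ALPHA = 0.3  # assumed fixed non-randomness factor (not stated in the abstract)


def nrtl_gammas(params, x1, T, alpha=ALPHA):
    """Binary NRTL activity coefficients, assuming tau_ij = b_ij / T."""
    b12, b21 = params
    x2 = 1.0 - x1
    tau12, tau21 = b12 / T, b21 / T
    G12, G21 = np.exp(-alpha * tau12), np.exp(-alpha * tau21)
    ln_g1 = x2**2 * (tau21 * (G21 / (x1 + x2 * G21)) ** 2
                     + tau12 * G12 / (x2 + x1 * G12) ** 2)
    ln_g2 = x1**2 * (tau12 * (G12 / (x2 + x1 * G12)) ** 2
                     + tau21 * G21 / (x1 + x2 * G21) ** 2)
    return np.exp(ln_g1), np.exp(ln_g2)


def pmar(params, x1, T, g_ref):
    """Percentage mean absolute residual between fitted and reference gammas."""
    g_fit = np.stack(nrtl_gammas(params, x1, T))
    return 100.0 * np.mean(np.abs(g_fit - g_ref) / g_ref)


# Hypothetical conditions: 10 mole fractions at 3 temperatures (K).
X1, TT = np.meshgrid(np.linspace(0.05, 0.95, 10), [298.15, 323.15, 348.15])

# Synthetic reference data standing in for UNIFAC results (assumption).
TRUE_PARAMS = (300.0, 800.0)
G_REF = np.stack(nrtl_gammas(TRUE_PARAMS, X1, TT))

# Global search with differential evolution over assumed bounds on b12, b21 (K).
bounds = [(-2000.0, 5000.0), (-2000.0, 5000.0)]
de = differential_evolution(pmar, bounds, args=(X1, TT, G_REF), seed=0, tol=1e-8)

# Local refinement with Nelder-Mead, started from the DE result.
nm = minimize(pmar, de.x, args=(X1, TT, G_REF), method="Nelder-Mead",
              options={"xatol": 1e-8, "fatol": 1e-10})

print("DE  params:", de.x, "residual (%):", de.fun)
print("NMM params:", nm.x, "residual (%):", nm.fun)
```

In a real workflow the synthetic `G_REF` array would be replaced by activity coefficients computed with UNIFAC on the same grid, and the fitted residual would generally not approach zero, because NRTL cannot reproduce UNIFAC exactly.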