School of Electrical and Information Engineering, JiangSu University, ZhenJiang, 212013, JiangSu, China.
BMC Biotechnol. 2023 Nov 8;23(1):49. doi: 10.1186/s12896-023-00816-3.
A method combining offline techniques with the just-in-time learning (JITL) strategy is proposed, because the features and parameters of biochemical reaction processes often change over time.
First, multiple sub-databases for the fermentation process are constructed offline using an improved fuzzy C-means algorithm, and the sample data are adaptively pruned with a similarity query threshold. Second, in the online modeling stage, an improved eXtreme Gradient Boosting (XGBoost) method is used to build the soft sensor models, and a multi-similarity-driven just-in-time learning strategy is used to increase model diversity. Finally, to improve the generalization of the overall algorithm, the outputs of the base learners are fused by an improved Stacking ensemble model to produce the final prediction.
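The online stage described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: the function names, the two similarity measures, the neighbor count `k`, and the toy data are hypothetical. A similarity-weighted mean of the nearest samples stands in for the improved XGBoost base learner, and a plain average stands in for the improved Stacking fusion.

```python
import math

def euclidean_similarity(a, b):
    """Distance-based similarity in (0, 1]; equals 1 when a == b."""
    d = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return math.exp(-d)

def cosine_similarity(a, b):
    """Angle-based similarity; returns 0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 0.0
    return dot / (na * nb)

def jitl_predict(query, X, y, similarity, k=3, threshold=0.0):
    """Build a local model for one query at prediction time.

    Samples below the similarity threshold are discarded, the k most
    similar remaining samples are kept, and the prediction is their
    similarity-weighted mean (a stand-in for a trained local learner).
    """
    scored = [(similarity(query, x), t) for x, t in zip(X, y)]
    scored = [(s, t) for s, t in scored if s >= threshold]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    top = scored[:k]
    total = sum(s for s, _ in top)
    if not top or total <= 0:
        raise ValueError("no sufficiently similar samples for this query")
    return sum(s * t for s, t in top) / total

def ensemble_predict(query, X, y, measures, k=3):
    """One local model per similarity measure, fused by a plain average
    (a simplification of the stacking step)."""
    preds = [jitl_predict(query, X, y, m, k=k) for m in measures]
    return sum(preds) / len(preds)

# Toy historical samples (hypothetical, not fermentation data).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 3.0]
prediction = ensemble_predict(
    [1.0, 0.0], X, y, [euclidean_similarity, cosine_similarity], k=1
)
```

Using a different similarity measure for each local model means each model sees a different neighborhood of the query, which is one simple way to obtain diverse base learners before fusion.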
The constructed soft sensor model was applied to predicting cell concentration and product concentration in the Pichia pastoris fermentation process. For cell concentration, the root mean square error (RMSE) was 0.0260 and the coefficient of determination (R²) was 0.9945; for product concentration, the RMSE was 2.6688 and R² was 0.9970. These results indicate that the proposed method delivers timely and accurate predictions, supporting its effectiveness and practicality.
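For reference, the two reported metrics can be computed as in the short sketch below. It uses the standard definitions of root mean square error and coefficient of determination. The function names and the sample values are illustrative assumptions and are not taken from the study's data.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error: sqrt(mean((y - y_hat)^2))."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Illustrative values only (hypothetical measurements and predictions).
measured = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 4.0]
error = rmse(measured, predicted)
fit = r_squared(measured, predicted)
```

RMSE is expressed in the units of the predicted variable, so the values for cell concentration and product concentration are not directly comparable with each other, whereas R² is dimensionless.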
JS-ISSA-XGBoost is a broadly applicable, high-accuracy soft sensor model that meets the practical need for real-time parameter monitoring and predictive control in biochemical reactions.