Zhang Sheng, Chen Hang, Wan Yuxiang, Wang Haibin, Qu Haibin
Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China.
BioRay Pharmaceutical Co., Ltd., Taizhou 318000, China.
Pharmaceutics. 2024 Aug 17;16(8):1082. doi: 10.3390/pharmaceutics16081082.
Monoclonal antibody (mAb) manufacturing is both highly profitable and costly, so mAb productivity is of vital importance. However, many factors can affect the cell culture process and reduce mAb productivity. The biopharma industry is now actively adopting manufacturing information systems, which enable the integration of online and offline data. Although the volume of data is large, data mining studies aimed at improving mAb productivity are rare. This study therefore proposes a data-driven approach that leverages both the inline and offline data of the cell culture process to discover the causes of mAb productivity reduction. The approach consists of four steps: data preprocessing, phase division, feature extraction and fusion, and cluster comparison. First, data quality issues are resolved in the data preprocessing step. Next, the inline data are divided into several phases using the moving window k-nearest neighbor method. Then, features of the inline data are extracted via functional data analysis and combined with the offline data features. Finally, the causes of mAb productivity reduction are identified by contrasting clusters using principal component analysis. A commercial-scale cell culture process case study is provided to verify the effectiveness of the approach. Data from 35 batches were collected, and each batch contained nine inline variables and seven offline variables. The cause of the mAb productivity reduction was identified as a lack of nutrients; recommended actions were taken accordingly, and the finding was subsequently confirmed by six validation batches.
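The last two steps (feature fusion and cluster comparison with principal component analysis) can be illustrated with a minimal sketch. This is not the authors' implementation; the abstract does not give the details, so the following is one plausible reading under stated assumptions. It assumes each batch has already been reduced to a row of numeric features (the functional data analysis step is not shown), that batches are already labeled as low or normal productivity, and that variables are ranked by how strongly they contribute to the separation of the two clusters in the retained principal component subspace. All function names, variable names, and the toy data are hypothetical.

```python
import numpy as np

def fuse_features(inline_feats, offline_feats):
    """Concatenate per-batch inline and offline features column-wise."""
    return np.hstack([inline_feats, offline_feats])

def pca_scores(X, n_components=2):
    """Standardize columns, then compute PCA via SVD.

    Returns (scores, loadings): scores has shape (batches, components),
    loadings has shape (variables, components).
    """
    mu = X.mean(axis=0)
    sd = X.std(axis=0, ddof=1)
    sd[sd == 0] = 1.0  # avoid dividing by zero for constant columns
    Z = (X - mu) / sd
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    loadings = Vt[:n_components].T
    return scores, loadings

def contrast_clusters(X, is_low, names, n_components=2):
    """Rank variables by their contribution to the cluster separation.

    The difference between the cluster mean scores (low minus normal) is
    mapped back to variable space through the loadings. A negative value
    means the variable is lower in the low-productivity cluster.
    """
    scores, loadings = pca_scores(X, n_components)
    delta = scores[is_low].mean(axis=0) - scores[~is_low].mean(axis=0)
    contrib = loadings @ delta
    order = np.argsort(-np.abs(contrib))
    return [(names[i], float(contrib[i])) for i in order]

# Toy data (hypothetical, not from the study): 6 batches, the last 3 with
# low productivity. Two nutrient-related features drop in the low batches;
# the third feature is unrelated to productivity.
inline = np.array([[10.0, 1.0], [11.0, -2.0], [12.0, 1.0],
                   [1.0, 1.0], [2.0, -2.0], [3.0, 1.0]])
offline = np.array([[20.0], [22.0], [24.0], [2.0], [4.0], [6.0]])
names = ["nutrient_a_hyp", "agitation_hyp", "nutrient_b_hyp"]
is_low = np.array([False, False, False, True, True, True])

X = fuse_features(inline, offline)
ranked = contrast_clusters(X, is_low, names)
for name, value in ranked:
    print(f"{name}: {value:+.3f}")
```

In this toy example the two nutrient-related features receive large negative contributions and the unrelated feature receives a contribution near zero, which mirrors the kind of conclusion reported in the case study (a lack of nutrients). In practice the number of retained components and the cluster labels would need to be chosen from the real batch data.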