Department of Biotechnology, Delft University of Technology, Delft, The Netherlands.
GSK, Technical Research & Development - Microbial Drug Substance, Rixensart, Belgium.
Biotechnol J. 2024 Mar;19(3):e2300708. doi: 10.1002/biot.202300708.
Protein-based biopharmaceuticals require high purity before final formulation to ensure product safety, which makes process development time-consuming. Implementing computational approaches at the initial stages of process development can significantly reduce development effort: by preselecting process conditions, experimental screening can be limited to a subset of them. One such computational selection approach is the application of Quantitative Structure-Property Relationship (QSPR) models that describe the properties exploited during purification. This work presents a novel open-source Python tool that extracts a range of features from protein 3D models on a local computer, allowing total transparency of the calculations. As an open-source tool, it also lowers the initial investment that third parties need to construct a QSPR workflow for protein property prediction, making it widely applicable within the field of bioprocess development. The molecular features currently calculated focus on properties projected onto the protein surface by constructing surface grid representations. Linear regression models were trained with the calculated features to predict chromatographic retention times/volumes. Model validation shows high accuracy for anion and cation exchange chromatography data (cross-validated R² of 0.87 and 0.95, respectively). Hence, these models demonstrate the potential of QSPR to accelerate process design.
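The validation step described above, a linear regression scored by a cross-validated R², can be illustrated with a minimal sketch. This is not the tool presented in the paper: the function name `loo_cross_validated_r2`, the choice of leave-one-out cross-validation, and the synthetic feature and retention values are all assumptions made for illustration, since the abstract does not state the cross-validation scheme or give any data.

```python
import numpy as np


def loo_cross_validated_r2(X, y):
    """Leave-one-out cross-validated R^2 for an ordinary least-squares model.

    Hypothetical helper for illustration; the cross-validation scheme used
    in the paper is not stated in the abstract.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Prepend a column of ones so the model includes an intercept.
    A = np.column_stack([np.ones(n), X])
    preds = np.empty(n)
    for i in range(n):
        train = np.arange(n) != i  # boolean mask leaving out sample i
        coef, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)
        preds[i] = A[i] @ coef  # predict the held-out sample
    ss_res = np.sum((y - preds) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data (assumption): 20 "proteins", 3 surface features each.
rng = np.random.default_rng(0)
n_proteins = 20
features = rng.normal(size=(n_proteins, 3))
true_weights = np.array([2.0, -1.0, 0.5])
retention = 10.0 + features @ true_weights + rng.normal(scale=0.3, size=n_proteins)

q2 = loo_cross_validated_r2(features, retention)
print(f"cross-validated R^2: {q2:.2f}")
```

Because every prediction is made for a protein that was held out of the fit, this score is a stricter estimate of predictive accuracy than the R² of a model fitted to all data at once. With real descriptors, the rows of `features` would be replaced by the surface features computed per protein and `retention` by the measured retention times or volumes.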