Department of Chemical Engineering and Materials Science, University of Minnesota, Twin Cities, Minneapolis, MN 55455, United States.
Protein Eng Des Sel. 2024 Jan 29;37. doi: 10.1093/protein/gzae010.
Protein developability is requisite for use in therapeutic, diagnostic, or industrial applications. Many developability assays are low throughput, which limits their utility to the later stages of protein discovery and evolution. Recent approaches enable experimental or computational assessment of many more variants, yet the breadth of their applicability across protein families and developability metrics is uncertain. Here, three library-scale assays (on-yeast protease, split green fluorescent protein (GFP), and non-specific binding) were evaluated for their ability to predict two key developability outcomes (thermal stability and recombinant expression) for the small protein scaffolds affibody and fibronectin. The assays' predictive capabilities were assessed via both linear correlation and machine learning models trained on the library-scale assay data. The on-yeast protease assay is highly predictive of thermal stability for both scaffolds, and the split-GFP assay is informative of affibody thermal stability and expression. The library-scale data were used to map sequence-developability landscapes for affibody and fibronectin binding paratopes, which guide future design of variants and libraries.