Landwehr Grant M, Bogart Jonathan W, Magalhaes Carol, Hammarlund Eric G, Karim Ashty S, Jewett Michael C
Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA.
Center for Synthetic Biology, Northwestern University, Evanston, IL, USA.
Nat Commun. 2025 Jan 20;16(1):865. doi: 10.1038/s41467-024-55399-0.
Enzyme engineering is limited by the challenge of rapidly generating and using large datasets of sequence-function relationships for predictive design. To address this challenge, we develop a machine learning (ML)-guided platform that integrates cell-free DNA assembly, cell-free gene expression, and functional assays to rapidly map fitness landscapes across protein sequence space and optimize enzymes for multiple, distinct chemical reactions. We apply this platform to engineer amide synthetases by evaluating substrate preference for 1,217 enzyme variants in 10,953 unique reactions. We use these data to build augmented ridge regression ML models for predicting amide synthetase variants capable of making nine small-molecule pharmaceuticals. Over these nine compounds, ML-predicted enzyme variants demonstrate 1.6- to 42-fold improved activity relative to the parent. Our ML-guided, cell-free framework promises to accelerate enzyme engineering by enabling iterative exploration of protein sequence space to build specialized biocatalysts in parallel.
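To make the modeling step concrete, the following is a minimal sketch of plain ridge regression on one-hot encoded sequence variants, used to rank untested combinations of mutations. It is not the authors' implementation: the abstract does not say what the "augmented" part of their models consists of, so that is left out here. The four-residue parent `MKTA`, the variant list, and all activity values are invented toy data, and `one_hot`, `fit_ridge`, and `predict` are hypothetical helper names.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical residues


def one_hot(seq):
    """Encode a sequence as a flat vector of length len(seq) * 20."""
    x = np.zeros((len(seq), len(AMINO_ACIDS)))
    for pos, aa in enumerate(seq):
        x[pos, AMINO_ACIDS.index(aa)] = 1.0
    return x.ravel()


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge fit: w = (X^T X + alpha * I)^-1 X^T (y - mean(y)).

    The mean of y is returned separately and acts as the intercept.
    """
    y_mean = y.mean()
    gram = X.T @ X + alpha * np.eye(X.shape[1])
    w = np.linalg.solve(gram, X.T @ (y - y_mean))
    return w, y_mean


def predict(X, w, intercept):
    """Predicted activity for each encoded variant (one row per variant)."""
    return X @ w + intercept


# Toy training set (assumed values): a parent plus four single mutants,
# each with a made-up relative activity for one reaction.
train = {
    "MKTA": 1.0,  # hypothetical parent
    "MRTA": 3.0,
    "MKTG": 2.0,
    "MKSA": 0.5,
    "LKTA": 0.2,
}
X_train = np.array([one_hot(s) for s in train])
y_train = np.array(list(train.values()))
w, b = fit_ridge(X_train, y_train, alpha=1.0)

# Rank untested combinations of the observed mutations against the parent.
candidates = ["MKTA", "MRTG", "MRSA", "LKTG"]
scores = predict(np.array([one_hot(s) for s in candidates]), w, b)
best = candidates[int(np.argmax(scores))]
print(best, dict(zip(candidates, np.round(scores, 3))))
```

Because the model is additive over positions, the variant combining the two beneficial toy mutations (`MRTG`) ranks first. In a setting like the one described, with nine target compounds, one such model would presumably be fit per reaction, which is an assumption about usage rather than something the abstract states.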