He Yan, Zhou Xibin, Yuan Fajie, Chang Xing
School of Medicine, Westlake University, Hangzhou, Zhejiang 310014, China; School of Life Sciences, Westlake University, Hangzhou, Zhejiang 310014, China; Research Center for Industries of the Future (RCIF), Westlake University, Hangzhou, Zhejiang 310014, China; Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, Zhejiang 310014, China; Westlake Center for Genome Editing, Westlake Laboratory of Life Sciences and Biomedicine, Westlake University, 18 Shilongshan Road, Hangzhou, Zhejiang 310024, China.
School of Engineering, Westlake University, Hangzhou, Zhejiang 310014, China.
STAR Protoc. 2024 Jul 12;5(3):103188. doi: 10.1016/j.xpro.2024.103188.
Protein language models (PLMs) are machine learning tools trained to predict masked amino acids within protein sequences, offering opportunities to enhance protein function without prior knowledge of their specific roles. Here, we present a protocol for optimizing thymine-DNA-glycosylase (TDG) using PLMs. We describe steps for "zero-shot" enzyme optimization, construction of plasmids, double plasmid transfection, and high-throughput sequencing and data analysis. This protocol holds promise for streamlining the engineering of gene editing tools, delivering improved activity while minimizing the experimental workload. For complete details on the use and execution of this protocol, please refer to He et al.
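The abstract does not say which model or scoring rule the protocol uses, so the following is only a minimal sketch of one common way masked-residue predictions can be turned into "zero-shot" mutation rankings: mask each position, then compare the log-probability of each substitution with that of the wild-type residue. The names `MaskedPredictor`, `toy_predictor`, and `rank_substitutions` are hypothetical, and `toy_predictor` is a stand-in that would have to be replaced by a real PLM.

```python
import math
from typing import Callable, Dict, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical interface (assumption): given a sequence and the index of a
# masked position, return a probability for each of the 20 amino acids.
MaskedPredictor = Callable[[str, int], Dict[str, float]]


def toy_predictor(sequence: str, position: int) -> Dict[str, float]:
    """Stand-in for a real PLM (assumption, not a trained model).

    It simply favours the residue found immediately left of the mask.
    """
    weights = {aa: 1.0 for aa in AMINO_ACIDS}
    if position > 0:
        weights[sequence[position - 1]] += 4.0
    total = sum(weights.values())
    return {aa: w / total for aa, w in weights.items()}


def rank_substitutions(
    sequence: str, predictor: MaskedPredictor, top_k: int = 5
) -> List[Tuple[str, float]]:
    """Rank single substitutions by log p(mutant) - log p(wild type)."""
    scored: List[Tuple[str, float]] = []
    for pos, wild_type in enumerate(sequence):
        probs = predictor(sequence, pos)
        for aa in AMINO_ACIDS:
            if aa == wild_type:
                continue
            score = math.log(probs[aa]) - math.log(probs[wild_type])
            # Mutation label uses 1-based numbering, e.g. "K2M".
            scored.append((f"{wild_type}{pos + 1}{aa}", score))
    # Highest score first: substitutions the model prefers over wild type.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Short made-up sequence, not TDG.
    for mutation, score in rank_substitutions("MKTA", toy_predictor, top_k=3):
        print(mutation, round(score, 3))
```

In such a scheme, only the top-ranked substitutions would be carried forward to plasmid construction and testing, which is how a zero-shot approach could reduce experimental workload.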