Generate Biomedicines, 26 Landsdowne Street, Cambridge, MA, 02139, USA.
MIT Departments of Biology and Biological Engineering, 77 Massachusetts Ave., Cambridge, MA, 02139, USA.
Curr Opin Struct Biol. 2021 Aug;69:63-69. doi: 10.1016/j.sbi.2021.03.009. Epub 2021 Apr 25.
Computational protein design can generate proteins not found in nature that adopt desired structures and perform novel functions. Although proteins could, in theory, be designed with ab initio methods, practical success has come from using large amounts of data that describe the sequences, structures, and functions of existing proteins and their variants. We present recent creative uses of multiple-sequence alignments, protein structures, and high-throughput functional assays in computational protein design. Approaches range from enhancing structure-based design with experimental data to building regression models to training deep neural nets that generate novel sequences. Looking ahead, deep learning will be increasingly important for maximizing the value of data for protein design.