Laboratory for Biomolecular Nanotechnology, Department of Biosciences, School of Natural Sciences, Technical University of Munich, 85748 Garching, Germany.
Munich Institute of Biomedical Engineering, Technical University of Munich, 85748 Garching, Germany.
Science. 2024 Oct 25;386(6720):439-445. doi: 10.1126/science.adq1741. Epub 2024 Oct 24.
Machine learning (ML)-based design approaches have advanced the field of de novo protein design, with diffusion-based generative methods increasingly dominating protein design pipelines. Here, we report a "hallucination"-based protein design approach that functions in relaxed sequence space, enabling the efficient design of high-quality protein backbones over multiple scales and with broad scope of application without the need for any form of retraining. We experimentally produced and characterized more than 100 proteins. Three high-resolution crystal structures and two cryo-electron microscopy density maps of designed single-chain proteins comprising up to 1000 amino acids validate the accuracy of the method. Our pipeline can also be used to design synthetic protein-protein interactions, as validated experimentally by a set of protein heterodimers. Relaxed sequence optimization offers attractive performance with respect to designability, scope of applicability for different design problems, and scalability across protein sizes.
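The abstract does not spell out the optimization procedure, so the following is only a toy illustration of the general idea of optimizing in a relaxed sequence space: each position holds continuous logits over the 20 amino acids instead of one discrete residue, the logits are updated by gradient descent on a differentiable loss, and a discrete sequence is read out at the end. The per-position score table `scores`, the function names, and the `steps` and `lr` values are hypothetical stand-ins; in an actual hallucination pipeline the loss would come from a structure-prediction model, and this sketch is not the authors' implementation.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def softmax(row):
    """Turn one position's logits into a probability distribution."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def optimize_relaxed_sequence(scores, steps=200, lr=0.5):
    """Gradient descent on per-position logits (the relaxed sequence).

    `scores[pos][a]` is a hypothetical reward for residue `a` at `pos`.
    It stands in for a real differentiable design objective.
    """
    # Start from a uniform (maximally relaxed) distribution at every position.
    logits = [[0.0] * len(AMINO_ACIDS) for _ in scores]
    losses = []
    for _ in range(steps):
        loss = 0.0
        for pos, pos_scores in enumerate(scores):
            probs = softmax(logits[pos])
            expected = sum(p * s for p, s in zip(probs, pos_scores))
            loss -= expected  # toy loss: negative expected score
            # Chain rule through softmax:
            # d(loss)/d(logit_a) = -p_a * (s_a - expected)
            for a, (p, s) in enumerate(zip(probs, pos_scores)):
                logits[pos][a] += lr * p * (s - expected)
        losses.append(loss)
    # Discretize: take the most probable residue at each position.
    sequence = "".join(
        AMINO_ACIDS[max(range(len(AMINO_ACIDS)), key=row.__getitem__)]
        for row in logits
    )
    return sequence, losses
```

Calling `optimize_relaxed_sequence(scores)` with a list of 20-element score rows returns the discretized sequence together with the loss trajectory. Working in the continuous space is what makes gradient-based updates possible; the final argmax step maps the relaxed representation back to a concrete amino acid sequence.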