Protein sequence design by conformational landscape optimization.

Affiliations

Department of Biochemistry, University of Washington, Seattle, WA 98105.

Institute for Protein Design, University of Washington, Seattle, WA 98105.

Publication information

Proc Natl Acad Sci U S A. 2021 Mar 16;118(11). doi: 10.1073/pnas.2017228118.

DOI:10.1073/pnas.2017228118

PMID:33712545

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7980421/

Abstract

The protein design problem is to identify an amino acid sequence that folds to a desired structure. Given Anfinsen's thermodynamic hypothesis of folding, this can be recast as finding an amino acid sequence for which the desired structure is the lowest energy state. As this calculation involves not only all possible amino acid sequences but also, all possible structures, most current approaches focus instead on the more tractable problem of finding the lowest-energy amino acid sequence for the desired structure, often checking by protein structure prediction in a second step that the desired structure is indeed the lowest-energy conformation for the designed sequence, and typically discarding a large fraction of designed sequences for which this is not the case. Here, we show that by backpropagating gradients through the transform-restrained Rosetta (trRosetta) structure prediction network from the desired structure to the input amino acid sequence, we can directly optimize over all possible amino acid sequences and all possible structures in a single calculation. We find that trRosetta calculations, which consider the full conformational landscape, can be more effective than Rosetta single-point energy estimations in predicting folding and stability of de novo designed proteins. We compare sequence design by conformational landscape optimization with the standard energy-based sequence design methodology in Rosetta and show that the former can result in energy landscapes with fewer alternative energy minima. We show further that more funneled energy landscapes can be designed by combining the strengths of the two approaches: the low-resolution trRosetta model serves to disfavor alternative states, and the high-resolution Rosetta model serves to create a deep energy minimum at the design target structure.
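
The gradient-based design idea in the abstract can be illustrated with a toy example. The sketch below is not trRosetta and not the authors' protocol: `predict` is a hypothetical stand-in with random weights `W` that maps a relaxed (softmax) sequence to per-pair distance-bin probabilities, and the sizes, step size, and backtracking rule are all assumptions made for illustration. It only shows the mechanics of pushing a loss on predicted structure features back to the sequence logits.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def predict(p, W):
    # Hypothetical toy predictor (NOT trRosetta): pairwise bin logits
    # z[i, j, b] = sum_{a, c} p[i, a] * W[a, c, b] * p[j, c]
    return softmax(np.einsum("ia,acb,jc->ijb", p, W, p))

def loss_and_grad(x, W, target):
    # Cross-entropy between target and predicted bin distributions,
    # with the gradient pushed back by hand to the sequence logits x.
    p = softmax(x)
    q = predict(p, W)
    n_pairs = q.shape[0] * q.shape[1]
    loss = -np.sum(target * np.log(q)) / n_pairs
    g = (q - target) / n_pairs  # dLoss/dz (target sums to 1 per pair)
    # Each residue appears as both the first and second index of a pair.
    dp = (np.einsum("ijb,acb,jc->ia", g, W, p)
          + np.einsum("jib,cab,jc->ia", g, W, p))
    # Backward pass through the per-position softmax.
    dx = p * (dp - np.sum(dp * p, axis=-1, keepdims=True))
    return loss, dx

def design(W, target, length, steps=200, lr=20.0, seed=0):
    # Gradient descent on sequence logits; the step is halved until the
    # loss decreases (an assumed, simple backtracking rule).
    rng = np.random.default_rng(seed)
    x = 0.01 * rng.standard_normal((length, W.shape[0]))
    history = []
    for _ in range(steps):
        loss, dx = loss_and_grad(x, W, target)
        history.append(loss)
        step = lr
        while step > 1e-6:
            candidate = x - step * dx
            if loss_and_grad(candidate, W, target)[0] < loss:
                x = candidate
                break
            step *= 0.5
    history.append(loss_and_grad(x, W, target)[0])
    return x, history

# Toy problem: the "desired structure" is what the stand-in predictor
# outputs for a hidden random sequence.
rng = np.random.default_rng(1)
L, A, B = 8, 20, 6  # residues, amino acid types, distance bins (assumed)
W = rng.standard_normal((A, A, B))
native = np.eye(A)[rng.integers(0, A, size=L)]
target = predict(native, W)

x, history = design(W, target, L)
designed = x.argmax(axis=1)  # most probable residue index per position
print(f"loss: {history[0]:.3f} -> {history[-1]:.3f}")
```

Running the block prints the loss before and after optimization; `designed` holds the argmax residue index at each position. In a real setting the stand-in predictor would be replaced by a trained network and the gradients obtained by automatic differentiation rather than by hand.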

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1a70/7980421/9144eac3772d/pnas.2017228118fig01.jpg

Similar articles

The trRosetta server for fast and accurate protein structure prediction.

Nat Protoc. 2021 Dec;16(12):5634-5651. doi: 10.1038/s41596-021-00628-9. Epub 2021 Nov 10.

Modeling disordered regions in proteins using Rosetta.

PLoS One. 2011;6(7):e22060. doi: 10.1371/journal.pone.0022060. Epub 2011 Jul 29.

Sequence-dependent and -independent information in a combined random energy model for protein folding and coding.

Proteins. 2024 May;92(5):679-687. doi: 10.1002/prot.26658. Epub 2023 Dec 29.

Iterated local search with partition crossover for computational protein design.

Proteins. 2021 Nov;89(11):1522-1529. doi: 10.1002/prot.26174. Epub 2021 Jul 17.

Rosetta:MSF:NN: Boosting performance of multi-state computational protein design with a neural network.

PLoS One. 2021 Aug 26;16(8):e0256691. doi: 10.1371/journal.pone.0256691. eCollection 2021.

Funneling and frustration in the energy landscapes of some designed and simplified proteins.

J Chem Phys. 2013 Sep 28;139(12):121908. doi: 10.1063/1.4813504.

Structure of the space of folding protein sequences defined by large language models.

Phys Biol. 2024 Jan 31;21(2). doi: 10.1088/1478-3975/ad205c.

A Rosetta-based protein design protocol converging to natural sequences.

J Chem Phys. 2021 Feb 21;154(7):074114. doi: 10.1063/5.0039240.

Perturbing the energy landscape for improved packing during computational protein design.

Proteins. 2021 Apr;89(4):436-449. doi: 10.1002/prot.26030. Epub 2020 Dec 11.

Cited by

AlphaDesign: a de novo protein design framework based on AlphaFold.

Mol Syst Biol. 2025 Jun 17. doi: 10.1038/s44320-025-00119-z.

Computational design of conformation-biasing mutations to alter protein functions.

bioRxiv. 2025 Jun 2:2025.05.03.652001. doi: 10.1101/2025.05.03.652001.

Designed miniproteins potently inhibit and protect against MERS-CoV.

Cell Rep. 2025 Jun 24;44(6):115760. doi: 10.1016/j.celrep.2025.115760. Epub 2025 May 31.

Expert-guided protein language models enable accurate and blazingly fast fitness prediction.

Bioinformatics. 2024 Nov 1;40(11). doi: 10.1093/bioinformatics/btae621.

Designed miniproteins potently inhibit and protect against MERS-CoV.

bioRxiv. 2024 Nov 4:2024.11.03.621760. doi: 10.1101/2024.11.03.621760.

Scalable protein design using optimization in a relaxed sequence space.

Science. 2024 Oct 25;386(6720):439-445. doi: 10.1126/science.adq1741. Epub 2024 Oct 24.

Decoding biology with massively parallel reporter assays and machine learning.

Genes Dev. 2024 Oct 16;38(17-20):843-865. doi: 10.1101/gad.351800.124.

The State-of-the-Art Overview to Application of Deep Learning in Accurate Protein Design and Structure Prediction.

Top Curr Chem (Cham). 2024 Jul 4;382(3):23. doi: 10.1007/s41061-024-00469-6.

Toward enhancement of antibody thermostability and affinity by computational design in the absence of antigen.

MAbs. 2024 Jan-Dec;16(1):2362775. doi: 10.1080/19420862.2024.2362775. Epub 2024 Jun 20.

Computational design of soluble and functional membrane protein analogues.

Nature. 2024 Jul;631(8020):449-458. doi: 10.1038/s41586-024-07601-y. Epub 2024 Jun 19.
