De novo protein design by deep network hallucination.

Affiliations

Department of Biochemistry, University of Washington, Seattle, WA, USA.

Institute for Protein Design, University of Washington, Seattle, WA, USA.

Publication information

Nature. 2021 Dec;600(7889):547-552. doi: 10.1038/s41586-021-04184-w. Epub 2021 Dec 1.

DOI:10.1038/s41586-021-04184-w

PMID:34853475

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9293396/

Abstract

There has been considerable recent progress in protein structure prediction using deep neural networks to predict inter-residue distances from amino acid sequences. Here we investigate whether the information captured by such networks is sufficiently rich to generate new folded proteins with sequences unrelated to those of the naturally occurring proteins used in training the models. We generate random amino acid sequences, and input them into the trRosetta structure prediction network to predict starting residue-residue distance maps, which, as expected, are quite featureless. We then carry out Monte Carlo sampling in amino acid sequence space, optimizing the contrast (Kullback-Leibler divergence) between the inter-residue distance distributions predicted by the network and background distributions averaged over all proteins. Optimization from different random starting points resulted in novel proteins spanning a wide range of sequences and predicted structures. We obtained synthetic genes encoding 129 of the network-'hallucinated' sequences, and expressed and purified the proteins in Escherichia coli; 27 of the proteins yielded monodisperse species with circular dichroism spectra consistent with the hallucinated structures. We determined the three-dimensional structures of three of the hallucinated proteins, two by X-ray crystallography and one by NMR, and these closely matched the hallucinated models. Thus, deep networks trained to predict native protein structures from their sequences can be inverted to design new proteins, and such networks and methods should contribute alongside traditional physics-based models to the de novo design of proteins with new functions.
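The sampling loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_predictor` is a hypothetical stand-in for the trRosetta network (it maps each residue pair to a fixed random distance-bin distribution), the background is assumed to be uniform over distance bins rather than averaged over all proteins, and the step count, temperature, and bin count are arbitrary illustrative values. Only the overall shape is taken from the text: start from a random sequence, propose single-residue mutations, and accept them with a Metropolis criterion that favours a larger Kullback-Leibler divergence from the background.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
N_BINS = 8  # assumed number of distance bins (illustrative only)

# Fixed random weights for the hypothetical stand-in predictor below.
_TOY_WEIGHTS = np.random.default_rng(0).normal(size=(20, 20, N_BINS))


def toy_predictor(seq_idx):
    """HYPOTHETICAL stand-in for a structure prediction network.

    Takes an integer array of residue indices (length L) and returns an
    (L, L, N_BINS) array in which each [i, j] entry is a probability
    distribution over distance bins.
    """
    logits = _TOY_WEIGHTS[seq_idx[:, None], seq_idx[None, :]]
    logits = logits - logits.max(axis=-1, keepdims=True)  # stable softmax
    probs = np.exp(logits)
    return probs / probs.sum(axis=-1, keepdims=True)


def mean_kl_divergence(pred, background, eps=1e-8):
    """Average KL(pred || background) over all residue pairs."""
    p = np.clip(pred, eps, 1.0)
    q = np.clip(background, eps, 1.0)
    return float(np.mean(np.sum(p * np.log(p / q), axis=-1)))


def hallucinate(length=30, n_steps=500, temperature=0.05, seed=0,
                predictor=toy_predictor):
    """Metropolis Monte Carlo in sequence space, maximising the KL contrast.

    Returns (best_sequence, starting_score, best_score).
    """
    rng = np.random.default_rng(seed)
    seq = rng.integers(0, len(AMINO_ACIDS), size=length)  # random start
    pred = predictor(seq)
    # Assumption: uniform background instead of a protein-averaged one.
    background = np.full_like(pred, 1.0 / pred.shape[-1])
    score = mean_kl_divergence(pred, background)
    start_score = score
    best_seq, best_score = seq.copy(), score

    for _ in range(n_steps):
        trial = seq.copy()
        trial[rng.integers(length)] = rng.integers(len(AMINO_ACIDS))
        trial_score = mean_kl_divergence(predictor(trial), background)
        delta = trial_score - score
        # Always accept improvements; accept worse moves with prob exp(delta/T).
        if delta >= 0 or rng.random() < np.exp(delta / temperature):
            seq, score = trial, trial_score
            if score > best_score:
                best_seq, best_score = seq.copy(), score

    best_string = "".join(AMINO_ACIDS[i] for i in best_seq)
    return best_string, start_score, best_score
```

Calling `hallucinate(length=20, n_steps=200, seed=1)` returns the best sequence found together with its starting and final contrast scores. Because the best-scoring sequence is tracked separately from the current one, the returned score is never lower than the starting score. Any function with the same input and output shapes as `toy_predictor` could be passed via the `predictor` argument.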
