
De novo protein design by deep network hallucination.

Affiliations

Department of Biochemistry, University of Washington, Seattle, WA, USA.

Institute for Protein Design, University of Washington, Seattle, WA, USA.

Publication information

Nature. 2021 Dec;600(7889):547-552. doi: 10.1038/s41586-021-04184-w. Epub 2021 Dec 1.

DOI: 10.1038/s41586-021-04184-w
PMID: 34853475
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9293396/

Abstract

There has been considerable recent progress in protein structure prediction using deep neural networks to predict inter-residue distances from amino acid sequences. Here we investigate whether the information captured by such networks is sufficiently rich to generate new folded proteins with sequences unrelated to those of the naturally occurring proteins used in training the models. We generate random amino acid sequences, and input them into the trRosetta structure prediction network to predict starting residue-residue distance maps, which, as expected, are quite featureless. We then carry out Monte Carlo sampling in amino acid sequence space, optimizing the contrast (Kullback-Leibler divergence) between the inter-residue distance distributions predicted by the network and background distributions averaged over all proteins. Optimization from different random starting points resulted in novel proteins spanning a wide range of sequences and predicted structures. We obtained synthetic genes encoding 129 of the network-'hallucinated' sequences, and expressed and purified the proteins in Escherichia coli; 27 of the proteins yielded monodisperse species with circular dichroism spectra consistent with the hallucinated structures. We determined the three-dimensional structures of three of the hallucinated proteins, two by X-ray crystallography and one by NMR, and these closely matched the hallucinated models. Thus, deep networks trained to predict native protein structures from their sequences can be inverted to design new proteins, and such networks and methods should contribute alongside traditional physics-based models to the de novo design of proteins with new functions.
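
The sampling loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code and does not call trRosetta: `make_toy_predictor` is a hypothetical stand-in that returns a distance-bin distribution for each residue pair, the background is assumed to be uniform rather than averaged over real proteins, and the bin count, fixed temperature, and step count are arbitrary choices. Only the overall shape follows the abstract: start from a random sequence, propose single-residue mutations, and accept them with a Metropolis criterion so that the mean Kullback-Leibler divergence from the background increases.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
N_BINS = 6  # hypothetical number of distance bins (assumption)


def make_toy_predictor(rng):
    """Build a hypothetical stand-in for a structure-prediction network.

    Each amino acid gets a random score per distance bin; a residue pair's
    bin distribution is the softmax of the two residues' summed scores.
    """
    table = {aa: [rng.gauss(0.0, 1.0) for _ in range(N_BINS)] for aa in AMINO_ACIDS}

    def predict(seq):
        maps = []
        for i in range(len(seq)):
            for j in range(i + 1, len(seq)):
                logits = [a + b for a, b in zip(table[seq[i]], table[seq[j]])]
                top = max(logits)
                exps = [math.exp(x - top) for x in logits]
                total = sum(exps)
                maps.append([e / total for e in exps])
        return maps

    return predict


def mean_kl(pred, background):
    """Average KL(pred || background) over all residue pairs."""
    total = 0.0
    for dist in pred:
        total += sum(p * math.log(p / q) for p, q in zip(dist, background) if p > 0.0)
    return total / len(pred)


def hallucinate(length=12, steps=300, temperature=0.02, seed=0):
    """Monte Carlo search in sequence space that maximizes mean KL divergence."""
    rng = random.Random(seed)
    predict = make_toy_predictor(rng)
    background = [1.0 / N_BINS] * N_BINS  # assumption: uniform background
    seq = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    score = mean_kl(predict(seq), background)
    start_score = score
    best_seq, best_score = list(seq), score
    for _ in range(steps):
        # Propose a single-residue substitution at a random position.
        trial = list(seq)
        trial[rng.randrange(length)] = rng.choice(AMINO_ACIDS)
        trial_score = mean_kl(predict(trial), background)
        delta = trial_score - score
        # Metropolis criterion: keep improvements, sometimes keep setbacks.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            seq, score = trial, trial_score
            if score > best_score:
                best_seq, best_score = list(seq), score
    return "".join(best_seq), start_score, best_score


if __name__ == "__main__":
    sequence, start, best = hallucinate()
    print(sequence, round(start, 3), round(best, 3))
```

Because the toy predictor has no notion of structure, the sequences it favors carry no biological meaning. The sketch only shows how a featureless starting prediction (low divergence from the background) is pushed toward a sharper one by accepted mutations.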
