Protein Design with Deep Learning.

Affiliations

Toulouse Biotechnology Institute, Université de Toulouse, CNRS, INRAE, INSA, ANITI, 31077 Toulouse, France.

Université Fédérale de Toulouse, ANITI, INRAE, UR 875, 31326 Toulouse, France.

Publication information

Int J Mol Sci. 2021 Oct 29;22(21):11741. doi: 10.3390/ijms222111741.

DOI:10.3390/ijms222111741

PMID:34769173

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC8584038/

Abstract

Computational Protein Design (CPD) has produced impressive results for engineering new proteins, resulting in a wide variety of applications. In the past few years, various efforts have aimed at replacing or improving existing design methods using Deep Learning technology to leverage the amount of publicly available protein data. Deep Learning (DL) is a very powerful tool to extract patterns from raw data, provided that data are formatted as mathematical objects and the architecture processing them is well suited to the targeted problem. In the case of protein data, specific representations are needed for both the amino acid sequence and the protein structure in order to capture respectively 1D and 3D information. As no consensus has been reached about the most suitable representations, this review describes the representations used so far, discusses their strengths and weaknesses, and details their associated DL architecture for design and related tasks.
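
As a rough illustration of what "formatted as mathematical objects" can mean, the sketch below builds two commonly used representations: a one-hot matrix for the amino acid sequence (1D information) and a pairwise distance matrix for the structure (3D information). This is a minimal sketch and not a method taken from the review. The function names `one_hot_encode` and `distance_matrix`, the 4-residue peptide, and its C-alpha coordinates are all made up for the example.

```python
import math

# The 20 standard amino acids in one-letter code (alphabetical by letter).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence):
    """Return an L x 20 matrix (list of lists) with a single 1 per row."""
    encoded = []
    for residue in sequence:
        row = [0] * len(AMINO_ACIDS)
        row[AA_INDEX[residue]] = 1  # raises KeyError on non-standard letters
        encoded.append(row)
    return encoded


def distance_matrix(coords):
    """Return the L x L matrix of pairwise Euclidean distances."""
    return [[math.dist(a, b) for b in coords] for a in coords]


# Hypothetical toy input: a 4-residue peptide with invented C-alpha
# coordinates (in angstroms); real coordinates would come from a structure file.
sequence = "ACDK"
ca_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]

seq_features = one_hot_encode(sequence)        # shape: 4 x 20
struct_features = distance_matrix(ca_coords)   # shape: 4 x 4, symmetric
```

A distance matrix is unchanged when the structure is rotated or translated, which is one reason such representations are often preferred over raw coordinates. Whether it is the most suitable choice is exactly the kind of open question the abstract refers to.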

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1aba/8584038/e08bcba971b5/ijms-22-11741-g001.jpg
