TIMED-Design: flexible and accessible protein sequence design with convolutional neural networks.

Affiliations

School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB United Kingdom.

School of Biological Sciences, University of Edinburgh, Roger Land Building, Edinburgh EH9 3FF, United Kingdom.

Publication information

Protein Eng Des Sel. 2024 Jan 29;37. doi: 10.1093/protein/gzae002.

DOI:10.1093/protein/gzae002
PMID:38288671
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10939383/
Abstract

Sequence design is a crucial step in the process of designing or engineering proteins. Traditionally, physics-based methods have been used to solve for optimal sequences, with the main disadvantages being that they are computationally intensive for the end user. Deep learning-based methods offer an attractive alternative, outperforming physics-based methods at a significantly lower computational cost. In this paper, we explore the application of Convolutional Neural Networks (CNNs) for sequence design. We describe the development and benchmarking of a range of networks, as well as reimplementations of previously described CNNs. We demonstrate the flexibility of representing proteins in a three-dimensional voxel grid by encoding additional design constraints into the input data. Finally, we describe TIMED-Design, a web application and command line tool for exploring and applying the models described in this paper. The user interface will be available at the URL: https://pragmaticproteindesign.bio.ed.ac.uk/timed. The source code for TIMED-Design is available at https://github.com/wells-wood-research/timed-design.
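
The voxel-grid representation mentioned in the abstract can be illustrated with a minimal sketch. This is not the TIMED-Design implementation: the channel layout, the grid size, and the function names `voxelize` and `mark_constraint` are hypothetical choices made only for illustration. The idea shown is that atoms are binned into a cubic grid with one channel per atom type, and that an additional design constraint could be encoded as one more input channel.

```python
# Hypothetical channel layout (an assumption, not taken from the paper):
# one channel per element plus one extra channel for a design constraint.
CHANNELS = {"C": 0, "N": 1, "O": 2, "constraint": 3}


def voxelize(atoms, centre, edge_voxels=5, voxel_size=1.0):
    """Bin (element, x, y, z) atoms into a sparse cubic voxel grid.

    Returns a dict mapping (i, j, k, channel) to 1.0 for a cube of
    edge_voxels ** 3 voxels centred on `centre`. Atoms that fall outside
    the cube, or whose element has no channel, are skipped.
    """
    half = edge_voxels * voxel_size / 2.0
    grid = {}
    for element, x, y, z in atoms:
        channel = CHANNELS.get(element)
        if channel is None:
            continue  # element not encoded in this sketch
        index = []
        for coord, c in zip((x, y, z), centre):
            i = int((coord - c + half) // voxel_size)
            if not 0 <= i < edge_voxels:
                break  # atom lies outside the grid on this axis
            index.append(i)
        else:
            grid[(index[0], index[1], index[2], channel)] = 1.0
    return grid


def mark_constraint(grid, voxel):
    """Flag one voxel in the extra constraint channel (illustrative only)."""
    i, j, k = voxel
    grid[(i, j, k, CHANNELS["constraint"])] = 1.0
    return grid


# Three made-up atoms; the oxygen is too far from the centre to be included.
atoms = [("N", 0.0, 0.0, 0.0), ("C", 1.2, 0.3, -0.4), ("O", 9.0, 0.0, 0.0)]
grid = voxelize(atoms, centre=(0.0, 0.0, 0.0))
grid = mark_constraint(grid, (2, 2, 2))
```

A sparse dictionary keeps the sketch free of dependencies; a dense array of shape (edge, edge, edge, channels) would be the more usual input to a 3D CNN.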

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/30cc/10939383/906fe7edc622/gzae002f6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/30cc/10939383/bf638026c4c6/gzae002ga1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/30cc/10939383/5b33eece6d08/gzae002f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/30cc/10939383/f4b6a9e1a8f2/gzae002f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/30cc/10939383/8d30664a1f44/gzae002f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/30cc/10939383/1f1d5c1a2365/gzae002f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/30cc/10939383/e49be2b90599/gzae002f5.jpg

Similar articles

1. TIMED-Design: flexible and accessible protein sequence design with convolutional neural networks.
   Protein Eng Des Sel. 2024 Jan 29;37. doi: 10.1093/protein/gzae002.
2. Prediction of RNA-protein sequence and structure binding preferences using deep convolutional and recurrent neural networks.
   BMC Genomics. 2018 Jul 3;19(1):511. doi: 10.1186/s12864-018-4889-1.
3. Performance improvement for a 2D convolutional neural network by using SSC encoding on protein-protein interaction tasks.
   BMC Bioinformatics. 2021 Apr 12;22(1):184. doi: 10.1186/s12859-021-04111-w.
4. Improved Protein-Ligand Binding Affinity Prediction with Structure-Based Deep Fusion Inference.
   J Chem Inf Model. 2021 Apr 26;61(4):1583-1592. doi: 10.1021/acs.jcim.0c01306. Epub 2021 Mar 23.
5. DeepECA: an end-to-end learning framework for protein contact prediction from a multiple sequence alignment.
   BMC Bioinformatics. 2020 Jan 9;21(1):10. doi: 10.1186/s12859-019-3190-x.
6. Convolutional neural networks with image representation of amino acid sequences for protein function prediction.
   Comput Biol Chem. 2021 Jun;92:107494. doi: 10.1016/j.compbiolchem.2021.107494. Epub 2021 Apr 24.
7. Accurate Prediction of Human Essential Proteins Using Ensemble Deep Learning.
   IEEE/ACM Trans Comput Biol Bioinform. 2022 Nov-Dec;19(6):3263-3271. doi: 10.1109/TCBB.2021.3122294. Epub 2022 Dec 8.
8. Protein secondary structure prediction improved by recurrent neural networks integrated with two-dimensional convolutional neural networks.
   J Bioinform Comput Biol. 2018 Oct;16(5):1850021. doi: 10.1142/S021972001850021X.
9. Identification of clathrin proteins by incorporating hyperparameter optimization in deep learning and PSSM profiles.
   Comput Methods Programs Biomed. 2019 Aug;177:81-88. doi: 10.1016/j.cmpb.2019.05.016. Epub 2019 May 17.
10. Evaluation of multislice inputs to convolutional neural networks for medical image segmentation.
    Med Phys. 2020 Dec;47(12):6216-6231. doi: 10.1002/mp.14391. Epub 2020 Nov 10.

Cited by

1. Assessing the generalization capabilities of TCR binding predictors via peptide distance analysis.
   PLoS One. 2025 May 20;20(5):e0324011. doi: 10.1371/journal.pone.0324011. eCollection 2025.
2. Protein engineering in the deep learning era.
   mLife. 2024 Dec 26;3(4):477-491. doi: 10.1002/mlf2.12157. eCollection 2024 Dec.
