Accurate Single-Sequence Prediction of Protein Intrinsic Disorder by an Ensemble of Deep Recurrent and Convolutional Architectures.

Author information

Signal Processing Laboratory, Griffith University, Brisbane, Queensland 4122, Australia.

Institute for Glycomics and School of Information and Communication Technology, Griffith University, Southport, Queensland 4222, Australia.

Publication information

J Chem Inf Model. 2018 Nov 26;58(11):2369-2376. doi: 10.1021/acs.jcim.8b00636. Epub 2018 Nov 13.

DOI:10.1021/acs.jcim.8b00636

PMID:30395465

Abstract

Recognition of the widespread existence of intrinsically disordered regions in proteins has spurred the development of computational techniques for their detection. Existing techniques fall into two classes: methods that rely on single-sequence information and methods that rely on evolutionary sequence profiles generated from multiple-sequence alignments. Profile-based methods are, in general, more accurate because the presence or absence of conserved amino acid residues in a protein sequence provides important information on the structural and functional roles of the residues. However, the wide applicability of profile-based techniques is limited by the time-consuming calculation of sequence profiles. Here we demonstrate that the performance gap between profile-based techniques and single-sequence methods can be reduced by using an ensemble of deep recurrent and convolutional neural networks that allow whole-sequence learning. In particular, the single-sequence method (called SPOT-Disorder-Single) is more accurate than SPOT-Disorder (a profile-based method) for proteins with few homologous sequences and comparable to it in predicting long disordered regions. The method's performance is robust across four independent test sets with different amounts of short and long disordered regions. SPOT-Disorder-Single is available as a Web server and as a standalone program at http://sparks-lab.org/jack/server/SPOT-Disorder-Single.
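
To make the idea of a single-sequence ensemble concrete, the following is a minimal sketch and not the SPOT-Disorder-Single implementation. It assumes that the input is a plain one-hot encoding of the amino acid sequence (no evolutionary profile) and that the ensemble combines its members by averaging their per-residue disorder probabilities. The names `one_hot`, `ensemble_predict`, and the toy models are hypothetical stand-ins for trained recurrent and convolutional networks.

```python
from typing import Callable, List, Sequence, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

# Hypothetical model type: maps per-residue features to per-residue probabilities.
Model = Callable[[List[List[float]]], List[float]]


def one_hot(sequence: str) -> List[List[float]]:
    """Encode each residue as a 20-dimensional one-hot vector.

    Residues outside the 20 standard letters become all-zero rows.
    """
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    encoded = []
    for residue in sequence.upper():
        row = [0.0] * len(AMINO_ACIDS)
        if residue in index:
            row[index[residue]] = 1.0
        encoded.append(row)
    return encoded


def ensemble_predict(
    sequence: str, models: Sequence[Model], threshold: float = 0.5
) -> Tuple[List[float], List[bool]]:
    """Average per-residue disorder probabilities over all ensemble members.

    Averaging and the 0.5 threshold are assumptions made for illustration.
    """
    if not models:
        raise ValueError("at least one model is required")
    features = one_hot(sequence)
    per_model = [model(features) for model in models]
    mean = [sum(column) / len(per_model) for column in zip(*per_model)]
    labels = [p >= threshold for p in mean]
    return mean, labels


# Toy stand-ins for trained networks: each returns a constant probability.
def toy_model_low(features: List[List[float]]) -> List[float]:
    return [0.25] * len(features)


def toy_model_high(features: List[List[float]]) -> List[float]:
    return [0.75] * len(features)


if __name__ == "__main__":
    probs, labels = ensemble_predict("ACDX", [toy_model_low, toy_model_high])
    print(probs, labels)
```

Because every member sees only the encoded sequence itself, a prediction of this kind needs no multiple-sequence alignment, which is the source of the speed advantage the abstract describes.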

Similar articles

SPOT-Disorder2: Improved Protein Intrinsic Disorder Prediction by Ensembled Deep Learning.

Genomics Proteomics Bioinformatics. 2019 Dec;17(6):645-656. doi: 10.1016/j.gpb.2019.01.004. Epub 2020 Mar 13.

Identifying molecular recognition features in intrinsically disordered regions of proteins by transfer learning.

Bioinformatics. 2020 Feb 15;36(4):1107-1113. doi: 10.1093/bioinformatics/btz691.

Improving protein disorder prediction by deep bidirectional long short-term memory recurrent neural networks.

Bioinformatics. 2017 Mar 1;33(5):685-692. doi: 10.1093/bioinformatics/btw678.

Identifying short disorder-to-order binding regions in disordered proteins with a deep convolutional neural network method.

J Bioinform Comput Biol. 2019 Feb;17(1):1950004. doi: 10.1142/S0219720019500045.

Proteus: a random forest classifier to predict disorder-to-order transitioning binding regions in intrinsically disordered proteins.

J Comput Aided Mol Des. 2017 May;31(5):453-466. doi: 10.1007/s10822-017-0020-y. Epub 2017 Apr 1.

Accurate prediction of protein contact maps by coupling residual two-dimensional bidirectional long short-term memory with convolutional neural networks.

Bioinformatics. 2018 Dec 1;34(23):4039-4045. doi: 10.1093/bioinformatics/bty481.

cnnAlpha: Protein disordered regions prediction by reduced amino acid alphabets and convolutional neural networks.

Proteins. 2020 Nov;88(11):1472-1481. doi: 10.1002/prot.25966. Epub 2020 Aug 7.

DeepCNF-D: Predicting Protein Order/Disorder Regions by Weighted Deep Convolutional Neural Fields.

Int J Mol Sci. 2015 Jul 29;16(8):17315-30. doi: 10.3390/ijms160817315.

POODLE: tools predicting intrinsically disordered regions of amino acid sequence.

Methods Mol Biol. 2014;1137:131-45. doi: 10.1007/978-1-4939-0366-5_10.

Cited by

Advancements in one-dimensional protein structure prediction using machine learning and deep learning.

Comput Struct Biotechnol J. 2025 Apr 3;27:1416-1430. doi: 10.1016/j.csbj.2025.04.005. eCollection 2025.

Taxonomy-specific assessment of intrinsic disorder predictions at residue and region levels in higher eukaryotes, protists, archaea, bacteria and viruses.

Comput Struct Biotechnol J. 2024 Apr 27;23:1968-1977. doi: 10.1016/j.csbj.2024.04.059. eCollection 2024 Dec.

Assessment of Disordered Linker Predictions in the CAID2 Experiment.

Biomolecules. 2024 Feb 28;14(3):287. doi: 10.3390/biom14030287.

Comparative evaluation of AlphaFold2 and disorder predictors for prediction of intrinsic disorder, disorder content and fully disordered proteins.

Comput Struct Biotechnol J. 2023 Jun 2;21:3248-3258. doi: 10.1016/j.csbj.2023.06.001. eCollection 2023.

New Insights into Radio-Resistance Mechanism Revealed by (Phospho)Proteome Analysis of after Heavy Ion Irradiation.

Int J Mol Sci. 2023 Oct 1;24(19):14817. doi: 10.3390/ijms241914817.

Tutorial: a guide for the selection of fast and accurate computational tools for the prediction of intrinsic disorder in proteins.

Nat Protoc. 2023 Nov;18(11):3157-3172. doi: 10.1038/s41596-023-00876-x. Epub 2023 Sep 22.

CAID prediction portal: a comprehensive service for predicting intrinsic disorder and binding regions in proteins.

Nucleic Acids Res. 2023 Jul 5;51(W1):W62-W69. doi: 10.1093/nar/gkad430.

Role of the proline-rich disordered domain of DROSHA in intronic microRNA processing.

Genes Dev. 2023 May 1;37(9-10):383-397. doi: 10.1101/gad.350275.122. Epub 2023 May 26.

DEPICTER2: a comprehensive webserver for intrinsic disorder and disorder function prediction.

Nucleic Acids Res. 2023 Jul 5;51(W1):W141-W147. doi: 10.1093/nar/gkad330.

AFTM: a database of transmembrane regions in the human proteome predicted by AlphaFold.

Database (Oxford). 2023 Mar 14;2023. doi: 10.1093/database/baad008.
