ODiNPred: comprehensive prediction of protein order and disorder.

Affiliations

Interdisciplinary Nanoscience Center (iNANO), Aarhus University, Gustav Wieds Vej 14, 8000, Aarhus C, Denmark.

Department of Chemistry, Aarhus University, Langelandsgade 140, 8000, Aarhus C, Denmark.

Publication information

Sci Rep. 2020 Sep 8;10(1):14780. doi: 10.1038/s41598-020-71716-1.

DOI: 10.1038/s41598-020-71716-1

PMID: 32901090

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7479119/

Abstract

Structural disorder is widespread in eukaryotic proteins and is vital for their function in diverse biological processes. It is therefore highly desirable to be able to predict the degree of order and disorder from amino acid sequence. It is, however, notoriously difficult to predict the degree of local flexibility within structured domains and the presence and nuances of localized rigidity within intrinsically disordered regions. To identify such instances, we used the CheZOD database, which encompasses accurate, balanced, and continuous-valued quantification of protein (dis)order at amino acid resolution based on NMR chemical shifts. To computationally forecast the spectrum of protein disorder in the most comprehensive manner possible, we constructed the sequence-based protein order/disorder predictor ODiNPred, trained on an expanded version of CheZOD. ODiNPred applies a deep neural network comprising 157 unique sequence features to 1325 protein sequences together with the experimental NMR chemical shift data. Cross-validation for 117 protein sequences shows that ODiNPred better predicts the continuous variation in order along the protein sequence, suggesting that contemporary predictors are limited by the quality of training data. The inclusion of evolutionary features reduces the performance gap between ODiNPred and its peers, but analysis shows that it retains greater accuracy for the more challenging prediction of intermediate disorder.
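
As a rough illustration of what scoring a continuous-valued order/disorder prediction against per-residue experimental values can involve, the sketch below computes a Pearson correlation and a simple two-state labelling. It is not ODiNPred's code or evaluation protocol: the function names, the example values, and the cutoff of 8.0 are all assumptions made for illustration only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(vx * vy)


def classify(scores, cutoff=8.0):
    """Label each residue as 'order' or 'disorder'.

    The cutoff is a hypothetical value chosen for this example.
    """
    return ["order" if s >= cutoff else "disorder" for s in scores]


# Made-up per-residue scores for a hypothetical 6-residue segment.
observed = [1.2, 2.5, 6.0, 9.5, 12.0, 14.1]
predicted = [2.0, 3.1, 5.2, 8.7, 11.5, 13.0]

r = pearson(observed, predicted)
labels = classify(predicted)
print(f"Pearson r = {r:.3f}")
print(labels)
```

A correlation over continuous scores rewards a predictor for tracking gradual changes along the sequence, whereas the two-state labels discard everything except which side of the cutoff a residue falls on.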

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/df24/7479119/8037084c12a6/41598_2020_71716_Fig2_HTML.jpg
