APOD: accurate sequence-based predictor of disordered flexible linkers.

Affiliations

Center for Applied Mathematics, Tianjin University, Tianjin 300072, China.

School of Statistics and Data Science, Nankai University, Tianjin 300074, China.

Publication information

Bioinformatics. 2020 Dec 30;36(Suppl_2):i754-i761. doi: 10.1093/bioinformatics/btaa808.

DOI:10.1093/bioinformatics/btaa808

PMID:33381830

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7773485/

Abstract

MOTIVATION

Disordered flexible linkers (DFLs) are abundant and functionally important intrinsically disordered regions that connect protein domains and structural elements within domains, and that facilitate disorder-based allosteric regulation. Although computational estimates suggest that thousands of proteins have DFLs, they have been annotated experimentally in <200 proteins. This substantial annotation gap can be reduced with the help of accurate computational predictors. The sole predictor of DFLs, DFLpred, trades off accuracy for shorter runtime by excluding relevant but computationally costly predictive inputs. Moreover, it relies on local, window-based information and does not consider useful protein-level characteristics.

RESULTS

We conceptualize, design and test APOD (Accurate Predictor Of DFLs), the first highly accurate predictor that utilizes both local- and protein-level inputs that quantify propensity for disorder, sequence composition, sequence conservation and selected putative structural properties. As a result, APOD offers significantly more accurate predictions than its faster predecessor, DFLpred, and several alternative ways to predict DFLs. These improvements stem from the use of a more comprehensive set of inputs that cover the protein-level information and the application of a more sophisticated predictive model, a well-parametrized support vector machine. APOD achieves area under the curve = 0.82 (28% improvement over DFLpred) and Matthews correlation coefficient = 0.42 (180% increase over DFLpred) when tested on an independent/low-similarity test dataset. Consequently, APOD is a suitable choice for accurate, small-scale prediction of DFLs.
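The two headline metrics can be unpacked with a little arithmetic. The following is a minimal sketch, assuming the quoted percentages are relative gains (new = baseline × (1 + gain)). The DFLpred values it prints are back-calculated from the abstract, not taken from the paper's tables, and the confusion-matrix counts in the last step are hypothetical and used only to show how a Matthews correlation coefficient is computed.

```python
import math

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def implied_baseline(new_value, relative_gain):
    """Baseline b such that new_value = b * (1 + relative_gain)."""
    return new_value / (1.0 + relative_gain)

# Values reported in the abstract for APOD on the test dataset.
auc_baseline = implied_baseline(0.82, 0.28)  # implied DFLpred AUC (assumption)
mcc_baseline = implied_baseline(0.42, 1.80)  # implied DFLpred MCC (assumption)
print(f"implied DFLpred AUC: {auc_baseline:.2f}")
print(f"implied DFLpred MCC: {mcc_baseline:.2f}")

# Hypothetical residue-level counts (not from the paper): DFL residues are
# rare, so negatives dominate and MCC is more informative than raw accuracy.
toy_mcc = mcc(tp=40, fp=60, tn=880, fn=20)
print(f"toy MCC: {toy_mcc:.2f}")
```

The large relative MCC gain next to a more modest AUC gain is consistent with a low starting MCC: a small absolute baseline makes any absolute improvement look large in percentage terms.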

AVAILABILITY AND IMPLEMENTATION

https://yanglab.nankai.edu.cn/APOD/.
