PredIDR2: Improving accuracy of protein intrinsic disorder prediction by updating deep convolutional neural network and supplementing DisProt data.

Author information

Han Kun-Sop, Kim Ha-Kyong, Kim Myong-Hyok, Pak Myong-Hyon, Pak Song-Jin, Choe Mun-Myong, Kim Chol-Song

Affiliations

University of Sciences, Pyongyang, Democratic People's Republic of Korea.

Branch of Biotechnology, State Academy of Sciences, Pyongyang, Democratic People's Republic of Korea.

Publication information

Int J Biol Macromol. 2025 May;306(Pt 4):141801. doi: 10.1016/j.ijbiomac.2025.141801. Epub 2025 Mar 5.

DOI:10.1016/j.ijbiomac.2025.141801

PMID:40054813

Abstract

Intrinsically disordered proteins (IDPs) and regions (IDRs) are widespread in proteomes, are involved in several important biological processes, and are implicated in many diseases. Many computational methods for IDR prediction are being developed to narrow the gap between the slow experimental annotation of proteins and the rapid growth in the number of non-annotated proteins, and their performance is blindly tested in the community-driven experiment, the Critical Assessment of protein Intrinsic Disorder (CAID). In this paper, we developed the PredIDR2 series, an updated version of the PredIDR method tested in CAID2, to accurately predict intrinsically disordered regions from protein sequence. The series comprises four methods that differ in their input features and in how the negative samples of the training set are generated. On the Disorder-PDB dataset of CAID2, the PredIDR2 series (AUC_ROC = 0.952) performs markedly better than our previous PredIDR (AUC_ROC = 0.933), which seems to be mainly attributable to the introduction of a new deep convolutional neural network and the augmentation of the training data, especially with data from the DisProt database. The PredIDR2 series outperforms the state-of-the-art IDR prediction methods that participated in CAID2 in terms of AUC_ROC, AUC_PR and DC_mae, and ranks among the seven top-performing methods in terms of MCC. The PredIDR2 series can be used freely through the CAID Prediction Portal at https://caid.idpcentral.org/portal or downloaded as a Singularity container from https://biocomputingup.it/shared/caid-predictors/.
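Two of the metrics named in the abstract, AUC_ROC and MCC, can be illustrated with a minimal Python sketch for per-residue disorder scores. This is not the CAID evaluation code or any part of PredIDR2: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration.

```python
from math import sqrt


def auc_roc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen disordered residue
    (label 1) scores higher than a randomly chosen ordered residue
    (label 0), with ties counted as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one residue of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mcc(labels, scores, threshold=0.5):
    """Matthews correlation coefficient after thresholding the scores.

    The default threshold of 0.5 is an illustrative assumption.
    """
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical toy data: 1 = disordered residue, 0 = ordered residue.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]

print(auc_roc(labels, scores), mcc(labels, scores))
```

AUC_ROC is threshold-free and depends only on how residues are ranked, whereas MCC depends on the chosen cutoff, which is one reason a method can rank differently under the two metrics.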

Similar articles

PredIDR: Accurate prediction of protein intrinsic disorder regions using deep convolutional neural network.

Int J Biol Macromol. 2025 Jan;284(Pt 1):137665. doi: 10.1016/j.ijbiomac.2024.137665. Epub 2024 Nov 19.

Critical assessment of protein intrinsic disorder prediction (CAID) - Results of round 2.

Proteins. 2023 Dec;91(12):1925-1934. doi: 10.1002/prot.26582. Epub 2023 Aug 25.

Computational Prediction of Linear Interacting Peptides.

Methods Mol Biol. 2025;2867:233-245. doi: 10.1007/978-1-0716-4196-5_14.

PUNCH2: Explore the strategy for intrinsically disordered protein predictor.

PLoS One. 2025 Mar 26;20(3):e0319208. doi: 10.1371/journal.pone.0319208. eCollection 2025.

cnnAlpha: Protein disordered regions prediction by reduced amino acid alphabets and convolutional neural networks.

Proteins. 2020 Nov;88(11):1472-1481. doi: 10.1002/prot.25966. Epub 2020 Aug 7.

Critical assessment of protein intrinsic disorder prediction.

Nat Methods. 2021 May;18(5):472-481. doi: 10.1038/s41592-021-01117-3. Epub 2021 Apr 19.

Accurate and Fast Prediction of Intrinsic Disorder Using flDPnn.

Methods Mol Biol. 2025;2867:201-218. doi: 10.1007/978-1-0716-4196-5_12.

flDPnn2: Accurate and Fast Predictor of Intrinsic Disorder in Proteins.

J Mol Biol. 2024 Sep 1;436(17):168605. doi: 10.1016/j.jmb.2024.168605. Epub 2024 May 8.

DisorderUnetLM: Validating ProteinUnet for efficient protein intrinsic disorder prediction.

Comput Biol Med. 2025 Feb;185:109586. doi: 10.1016/j.compbiomed.2024.109586. Epub 2024 Dec 20.
