DeepDRP: Prediction of intrinsically disordered regions based on integrated view deep learning architecture from transformer-enhanced and protein information.

Affiliations

School of Computer Science and Artificial Intelligence Aliyun School of Big Data School of Software, Changzhou University, Changzhou 213164, China.

Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China; School of Artificial Intelligence, Jilin University, Changchun 130012, China.

Publication information

Int J Biol Macromol. 2023 Dec 31;253(Pt 6):127390. doi: 10.1016/j.ijbiomac.2023.127390. Epub 2023 Oct 11.

Abstract

Intrinsic disorder in proteins, a widely distributed phenomenon in nature, is related to many crucial biological processes and various diseases. Traditional determination methods tend to be costly and labor-intensive, so an accurate method for identifying intrinsically disordered proteins (IDPs) is desirable. In this paper, we propose DeepDRP, a novel deep learning model for predicting intrinsically disordered regions in proteins. DeepDRP employs an innovative TimeDistributed strategy and a Bi-LSTM architecture to predict IDPs, and it is driven by integrated-view features: PSSM, energy-based encoding, AAindex, and transformer-enhanced embeddings from DR-BERT, OntoProtein, Prot-T5, and ESM-2. A comparison of different feature combinations indicates that the transformer-enhanced features contribute far more to IDP prediction than the traditional features, and that ESM-2 accounts for a larger contribution within the pre-trained fusion vectors. The ablation test verified that the TimeDistributed strategy improves model performance and is an efficient approach to IDP prediction. Compared with eight state-of-the-art methods on the DISORDER723, S1, and DisProt832 datasets, DeepDRP achieved a Matthews correlation coefficient higher than those of the competing methods by 4.90% to 36.20%, 11.80% to 26.33%, and 4.82% to 13.55%, respectively. In brief, DeepDRP is a reliable model for IDP prediction and is freely available at https://github.com/ZX-COLA/DeepDRP.
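For readers unfamiliar with the evaluation metric named in the abstract, the Matthews correlation coefficient (MCC) for a binary per-residue labeling can be sketched as follows. This is a minimal illustration of the standard MCC formula, not code from the DeepDRP repository; the function name `matthews_corrcoef`, the label convention (1 = disordered, 0 = ordered), and the toy sequences are assumptions made for the example.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Standard binary MCC.

    Assumed label convention (not taken from the paper):
    1 = disordered residue, 0 = ordered residue.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: MCC is reported as 0 when any marginal is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy per-residue labels for one hypothetical 8-residue protein.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 0, 1, 1]
print(matthews_corrcoef(y_true, y_pred))  # 0.5
```

MCC ranges from -1 to 1 and uses all four confusion-matrix cells, which makes it a common choice when disordered residues are much rarer than ordered ones.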
