Identifying Intrinsically Disordered Protein Regions through a Deep Neural Network with Three Novel Sequence Features.

Authors

Zhao Jiaxiang, Wang Zengke

Affiliation

College of Electronic Information and Optical Engineering, Nankai University, Tianjin 300350, China.

Publication

Life (Basel). 2022 Feb 26;12(3):345. doi: 10.3390/life12030345.

Abstract

Fast, reliable, and accurate identification of intrinsically disordered protein regions (IDPRs) is essential, because it has become increasingly clear in recent years that IDPRs affect many important physiological processes, including molecular recognition and molecular assembly, the regulation of transcription and translation, protein phosphorylation, and cellular signal transduction. For the sake of cost-effectiveness, computational approaches for identifying IDPRs need to be developed. In this study, a deep neural structure in which a variant of VGG19 sits between two multilayer perceptron (MLP) networks is developed for identifying IDPRs. In addition, three novel sequence features are introduced for identifying IDPRs for the first time: persistent entropy and the probabilities associated with two and three consecutive amino acids of the protein sequence. The simulation results show that our neural structure either performs considerably better than other known methods or attains similar performance while relying on a much smaller training set. Our deep neural structure, which exploits the VGG19 structure, is therefore effective for identifying IDPRs, and the three novel sequence features could serve as valuable inputs in the further development of IDPR identification methods.
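The abstract does not say how the three sequence features are computed, so the following is only an illustrative sketch under stated assumptions. It assumes the "probabilities associated with two and three consecutive amino acids" can be approximated by the empirical frequencies of overlapping 2-mers and 3-mers within one sequence, and it uses the standard definition of persistent entropy (the Shannon entropy of normalized bar lengths in a persistence barcode). The function names `kmer_probabilities` and `persistent_entropy` are hypothetical, and how the authors derive a barcode from a protein sequence is not covered here.

```python
from collections import Counter
from math import log


def kmer_probabilities(sequence, k):
    """Empirical probability of each run of k consecutive residues.

    Assumption: overlapping windows within a single sequence; the paper
    may estimate these probabilities differently.
    """
    if k < 1 or len(sequence) < k:
        return {}
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def persistent_entropy(intervals):
    """Shannon entropy of the normalized bar lengths of a barcode.

    `intervals` is an iterable of (birth, death) pairs with finite values.
    Assumption: the barcode is supplied by the caller; building it from a
    protein sequence is outside this sketch.
    """
    lengths = [death - birth for birth, death in intervals if death > birth]
    total = sum(lengths)
    if total == 0:
        return 0.0
    return -sum((l / total) * log(l / total) for l in lengths)


if __name__ == "__main__":
    seq = "MKVLAAGIVK"  # made-up example sequence, not from the paper
    print(kmer_probabilities(seq, 2))
    print(kmer_probabilities(seq, 3))
    print(persistent_entropy([(0.0, 1.0), (0.0, 2.0), (0.5, 1.5)]))
```

In a per-residue predictor, features like these would typically be computed over a sliding window around each position and then passed to the network, but that windowing scheme is likewise an assumption rather than something stated in the abstract.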


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1e9e/8950681/52fe4d4b2064/life-12-00345-g001.jpg
