Efficient utilization on PSSM combining with recurrent neural network for membrane protein types prediction.

Author information

Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming 650504, PR China.

Publication information

Comput Biol Chem. 2019 Aug;81:9-15. doi: 10.1016/j.compbiolchem.2019.107094. Epub 2019 Aug 8.

Abstract

The Position-Specific Scoring Matrix (PSSM) is an effective feature extraction method that was proposed early in protein classification prediction. Because of the constraints imposed by the shape of the PSSM, researchers have made many attempts to process it so that it can be fed to traditional machine learning algorithms. This processing discards part of the information the PSSM provides, which limits the feature representation. Moreover, the high-dimensional feature representation of the PSSM makes it incompatible with other feature extraction methods. We use the PSSM as the input to a Recurrent Neural Network (RNN) without any post-processing, treating each amino acid in the protein sequence as a time step in the RNN. This approach takes full advantage of the information the PSSM provides. In this study, the PSSM is fed directly into the model so that its internal information is fully utilized; we propose an end-to-end solution and achieve state-of-the-art performance. Finally, we explore how to combine the PSSM with traditional feature extraction methods, which yields slightly improved performance. Our network architecture is implemented in Python and is available at https://github.com/YellowcardD/RNN-for-membrane-protein-types-prediction.
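The idea of treating each residue as a time step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain (Elman-style) recurrent cell with random, untrained weights, an L x 20 PSSM given as a list of rows, and hypothetical values for the hidden size and the number of membrane protein classes. The abstract does not specify the cell type or these sizes, and the function names are illustrative.

```python
import math
import random

NUM_AMINO_ACIDS = 20  # one PSSM column per standard amino acid
NUM_CLASSES = 8       # assumed number of membrane protein types (hypothetical)
HIDDEN_SIZE = 16      # assumed hidden-state width (hypothetical)


def init_matrix(rows, cols, rng, scale=0.1):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rnn_classify(pssm, w_in, w_rec, w_out):
    """Run a vanilla RNN over a PSSM, one residue (row) per time step."""
    hidden = [0.0] * len(w_rec)
    for row in pssm:  # each row holds the 20 scores for one residue
        pre = [a + b for a, b in zip(matvec(w_in, row), matvec(w_rec, hidden))]
        hidden = [math.tanh(p) for p in pre]
    return softmax(matvec(w_out, hidden))  # class probabilities from final state


rng = random.Random(0)
w_in = init_matrix(HIDDEN_SIZE, NUM_AMINO_ACIDS, rng)
w_rec = init_matrix(HIDDEN_SIZE, HIDDEN_SIZE, rng)
w_out = init_matrix(NUM_CLASSES, HIDDEN_SIZE, rng)

# Toy PSSMs for two proteins of different lengths (random integers, not real data).
short_pssm = [[rng.randint(-5, 5) for _ in range(NUM_AMINO_ACIDS)] for _ in range(30)]
long_pssm = [[rng.randint(-5, 5) for _ in range(NUM_AMINO_ACIDS)] for _ in range(75)]

probs_short = rnn_classify(short_pssm, w_in, w_rec, w_out)
probs_long = rnn_classify(long_pssm, w_in, w_rec, w_out)
```

Because the recurrence consumes one row at a time, the same weights handle proteins of any length, so the PSSM does not have to be averaged or flattened into a fixed-size vector first. A trained model would learn the weights from labelled sequences and would likely use a gated cell; the repository linked above is the reference for the actual architecture.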
