Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, People's Republic of China.
School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, 510006, People's Republic of China.
BMC Bioinformatics. 2019 Dec 24;20(Suppl 25):700. doi: 10.1186/s12859-019-3275-6.
Membrane proteins play an important role in the life activities of organisms. Knowing membrane protein types provides clues for understanding the structure and function of proteins. Though various computational methods for predicting membrane protein types have been developed, the results still do not meet the expectations of researchers.
We propose two deep learning models, one to process sequence information and one to process evolutionary information. Both models obtained better results than traditional machine learning models. Furthermore, to improve the performance of the sequence information model, we provide a new vector representation method to replace one-hot encoding, which improved the overall success rate by 3.81% and 6.55% on the two datasets. Finally, a more effective model is obtained by fusing the two models; its overall success rate reached 95.68% and 92.98% on the two datasets.
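As a rough illustration of the two ideas mentioned above, the following sketch shows the one-hot encoding baseline for an amino-acid sequence and one possible way to fuse the class probabilities of two models. The abstract does not state how the fusion is performed, so the simple averaging used here is an assumption, and the function names (`one_hot_encode`, `fuse_by_averaging`, `predict_class`) and the example probabilities are hypothetical, not taken from the paper.

```python
# The 20 standard amino acids, in a fixed (alphabetical) order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence):
    """Return one 20-dimensional 0/1 row per residue.

    Residues outside the 20 standard letters are encoded as all zeros
    (an assumption made for this sketch).
    """
    rows = []
    for residue in sequence.upper():
        row = [0] * len(AMINO_ACIDS)
        if residue in INDEX:
            row[INDEX[residue]] = 1
        rows.append(row)
    return rows


def fuse_by_averaging(probs_a, probs_b):
    """Hypothetical late fusion: average per-class probabilities of two models."""
    if len(probs_a) != len(probs_b):
        raise ValueError("both models must score the same set of classes")
    return [(a + b) / 2 for a, b in zip(probs_a, probs_b)]


def predict_class(probs):
    """Return the index of the highest-scoring class."""
    return max(range(len(probs)), key=probs.__getitem__)


if __name__ == "__main__":
    encoded = one_hot_encode("MKT")
    print(len(encoded), len(encoded[0]))

    # Made-up probabilities for a three-class problem, for illustration only.
    sequence_model = [0.5, 0.25, 0.25]
    evolutionary_model = [1.0, 0.0, 0.0]
    fused = fuse_by_averaging(sequence_model, evolutionary_model)
    print(fused, predict_class(fused))
```

One-hot vectors treat every pair of residues as equally dissimilar, which is one plausible reason a learned or otherwise denser vector representation could perform better, as the reported improvement suggests.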
The final experimental results show that our method is more effective than existing methods for predicting membrane protein types and can help laboratory researchers identify the types of novel membrane proteins.