Huang Guohua, Zhang Yuchao, Chen Lei, Zhang Ning, Huang Tao, Cai Yu-Dong
Institute of Systems Biology, Shanghai University, Shanghai, China; Department of Mathematics, Shaoyang University, Shaoyang, Hunan, China.
Graduate School of the Chinese Academy of Sciences, Beijing, China; State Key Laboratory of Medical Genomics, Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
PLoS One. 2014 Mar 27;9(3):e93553. doi: 10.1371/journal.pone.0093553. eCollection 2014.
Membrane proteins are involved in a wide range of cellular processes and perform many important functions, which are mainly associated with their types. However, identifying membrane protein types with traditional biophysical methods is time-consuming and expensive. Although some computational tools for predicting membrane protein types have been developed, most of them can assign only one type to a protein. This limits their effectiveness, because a single membrane protein can belong to several types at the same time. To our knowledge, few methods that handle membrane proteins with multiple types have been reported. In this study, we propose an integrated approach that predicts multiple types of membrane proteins using sequence homology and a protein-protein interaction network. In the leave-one-out test on three datasets, the prediction accuracies reached 87.65%, 81.39% and 70.79%, respectively. The approach outperformed the nearest neighbor algorithm based on pseudo amino acid composition. The method is anticipated to be an alternative tool for identifying membrane protein types. New metrics for evaluating the performance of methods for multi-label problems are also presented. The program implementing the method is available upon request.
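As a rough illustration of the evaluation setting named in the abstract, the sketch below runs a leave-one-out test for a multi-label nearest neighbor predictor. It is not the authors' method: the feature vectors, the type names, and the helper functions `nearest_neighbor_labels` and `leave_one_out_accuracy` are hypothetical. The score used here is the average Jaccard overlap between predicted and true label sets, a common multi-label accuracy that is assumed for illustration and may differ from the metrics proposed in the paper.

```python
from math import dist


def nearest_neighbor_labels(query, train):
    """Return the label set of the training sample closest to `query`."""
    _, labels = min(train, key=lambda item: dist(query, item[0]))
    return labels


def leave_one_out_accuracy(samples):
    """Average Jaccard overlap between predicted and true label sets.

    Each sample is held out in turn and predicted from all the others.
    This score is an assumption, not the metric defined in the paper.
    """
    total = 0.0
    for i, (features, true_labels) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        predicted = nearest_neighbor_labels(features, rest)
        union = predicted | true_labels
        # Treat two empty label sets as a perfect match.
        total += len(predicted & true_labels) / len(union) if union else 1.0
    return total / len(samples)


# Hypothetical toy data: (feature vector, set of membrane protein types).
toy = [
    ((0.0, 0.0), {"type_a"}),
    ((0.1, 0.0), {"type_a"}),
    ((1.0, 1.0), {"type_a", "type_b"}),
    ((1.1, 1.0), {"type_b"}),
]

score = leave_one_out_accuracy(toy)
print(f"leave-one-out multi-label accuracy: {score:.2f}")
```

Because a protein can carry several types at once, the predictor returns a set of labels and the score gives partial credit for partially correct sets, which a single-label accuracy cannot do.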