Ni Qingshan, Zou Lingyun
Department of Microbiology, College of Basic Medical Sciences, Third Military Medical University, No. 30 Gaotanyan Road, Shapingba District, Chongqing 400038, P. R. China.
J Bioinform Comput Biol. 2014 Feb;12(1):1450003. doi: 10.1142/S0219720014500036. Epub 2014 Jan 7.
Outer membrane proteins (OMPs) play critical roles in many cellular processes, and discriminating them from other types of proteins is an important step in identifying OMPs among the proteins encoded in bacterial genomes. In this study, we develop a method, SSEA_SVM, that combines secondary structure element alignment (SSEA) with a support vector machine (SVM). A novel kernel function is designed so that the SVM classifier can use secondary structure information. A benchmark dataset consisting of 208 OMPs, 673 globular proteins, and 206 α-helical membrane proteins is used to evaluate the performance of SSEA_SVM. When SSEA_SVM is applied to discriminating OMPs from non-OMPs, it achieves an accuracy of 97.7% with a Matthews correlation coefficient (MCC) of 0.926. SSEA_SVM is also highly competitive with existing methods in the literature. We suggest that SSEA_SVM is a promising method for identifying OMPs in genomic proteins. A web server that implements SSEA_SVM is freely available at http://bioinfo.tmmu.edu.cn/SSEA_SVM/.
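The two figures of merit quoted above, accuracy and the Matthews correlation coefficient (MCC), can both be computed from a binary confusion matrix. The following is a minimal sketch using the standard definitions; the function names and the confusion counts are hypothetical and chosen only for illustration. They are not taken from the paper, and this is not the authors' evaluation code.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary classifier.

    Returns 0.0 when any marginal sum is zero, a common convention
    for the otherwise undefined 0/0 case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts (NOT from the paper): OMP is the positive class.
    tp, fn = 90, 10   # OMPs predicted as OMP / as non-OMP
    fp, tn = 5, 395   # non-OMPs predicted as OMP / as non-OMP
    print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
    print(f"MCC      = {mcc(tp, tn, fp, fn):.3f}")
```

MCC is reported alongside accuracy because the benchmark is imbalanced (208 OMPs against 879 non-OMPs): a classifier that labels every protein as non-OMP would still reach roughly 81% accuracy, while its MCC would be 0.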