Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan, Iran.
J Theor Biol. 2011 Jan 21;269(1):208-16. doi: 10.1016/j.jtbi.2010.10.026. Epub 2010 Oct 30.
In this study, we develop predictors of protein submitochondria location based on various sequence features. Knowing the submitochondria location of a mitochondrial protein can provide a much better understanding of its function. We use ten representative models of protein samples, including pseudo amino acid composition, dipeptide composition, functional domain composition, a combined discrete model based on predicted solvent accessibility and secondary structure elements, and a discrete model of pairwise sequence similarity. We construct a support vector machine (SVM) predictor for each representative model. In the leave-one-out cross-validation test, the predictor based on the discrete model of pairwise sequence similarity achieves an overall accuracy 1% higher than that of the best existing computational system for this problem. Moreover, we develop a method based on ordered weighted averaging (OWA), a data fusion operator, which we apply to the 11 best SVM-based classifiers constructed from various sequence features. This method is called Mito-Loc. The overall leave-one-out cross-validation accuracy obtained by Mito-Loc is about 95%, which indicates that Mito-Loc outperforms the best previously reported approach.
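The OWA fusion step can be illustrated with a minimal sketch. It is not the Mito-Loc implementation: the abstract does not give the OWA weights, the form of the classifier outputs, or the location labels, so the function names (`owa`, `fuse_predictions`), the per-class score dictionaries, the three labels, and the weight vector below are all assumptions made for illustration. The only part taken as given is the standard definition of OWA, in which the weights are applied to the input values after sorting them in descending order, not to particular classifiers.

```python
from typing import Dict, List, Sequence


def owa(values: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered weighted average of `values`.

    The weights attach to rank positions (largest value first),
    not to the source that produced each value.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def fuse_predictions(
    scores_per_classifier: List[Dict[str, float]],
    weights: Sequence[float],
) -> str:
    """Fuse per-class scores from several classifiers and return the winning label.

    Each dictionary maps a class label to the score one classifier assigns to it
    (a hypothetical output format; the abstract does not specify one).
    """
    labels = list(scores_per_classifier[0].keys())
    fused = {
        label: owa([scores[label] for scores in scores_per_classifier], weights)
        for label in labels
    }
    return max(fused, key=fused.get)


if __name__ == "__main__":
    # Hypothetical scores from three SVM-based classifiers for one protein.
    scores = [
        {"inner membrane": 0.7, "matrix": 0.2, "outer membrane": 0.1},
        {"inner membrane": 0.4, "matrix": 0.5, "outer membrane": 0.1},
        {"inner membrane": 0.6, "matrix": 0.3, "outer membrane": 0.1},
    ]
    # Assumed weight vector; it emphasises the most confident votes.
    print(fuse_predictions(scores, [0.5, 0.3, 0.2]))
```

The choice of weight vector controls how the operator behaves: weights of (1, 0, ..., 0) reduce OWA to the maximum, (0, ..., 0, 1) to the minimum, and uniform weights to the arithmetic mean.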