Dehzangi Abdollah, Heffernan Rhys, Sharma Alok, Lyons James, Paliwal Kuldip, Sattar Abdul
Institute for Integrated and Intelligent Systems, Griffith University, Brisbane, Australia; National ICT Australia (NICTA), Brisbane, Australia.
School of Engineering, Griffith University, Brisbane, Australia.
J Theor Biol. 2015 Jan 7;364:284-94. doi: 10.1016/j.jtbi.2014.09.029. Epub 2014 Sep 28.
Protein subcellular localization prediction is the task of predicting where in the cell a given protein functions. It is considered an important step towards protein function prediction and drug design. Recent studies have shown that using Gene Ontology (GO) for feature extraction can improve protein subcellular localization prediction performance. However, relying on GO alone has not solved the problem. At the same time, the impact of other feature sources, especially evolutionary-based features, has not been adequately explored for this task. In this study, we aim to extract discriminative evolutionary features to tackle this problem. To do this, we propose two segmentation-based feature extraction methods that explore potential local evolutionary-based information for Gram-positive and Gram-negative subcellular localization. We show that applying a Support Vector Machine (SVM) classifier to our extracted features improves Gram-positive and Gram-negative subcellular localization prediction accuracy by up to 6.4% over previous studies, including those that used GO for feature extraction.
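As a rough illustration of what a segmentation-based evolutionary feature can look like, the sketch below splits a per-residue profile into contiguous segments and averages each column within each segment. It is not the descriptors proposed in the paper, which the abstract does not specify. It assumes the evolutionary information is available as an L x 20 position-specific scoring matrix (PSSM); the function name `segment_means` and the toy matrix are hypothetical.

```python
from typing import List


def segment_means(pssm: List[List[float]], n_segments: int) -> List[float]:
    """Average each profile column within contiguous sequence segments.

    pssm is assumed to be an L x C matrix (one row per residue, C = 20 for a
    standard PSSM). The result is a flat vector of n_segments * C values.
    """
    length = len(pssm)
    if n_segments < 1 or length < n_segments:
        raise ValueError("need at least one residue per segment")
    n_cols = len(pssm[0])

    features: List[float] = []
    for s in range(n_segments):
        # Integer boundaries keep segments contiguous and non-empty.
        start = s * length // n_segments
        end = (s + 1) * length // n_segments
        block = pssm[start:end]
        for c in range(n_cols):
            features.append(sum(row[c] for row in block) / len(block))
    return features


# Toy profile (made-up values): 6 residues, every column of row i equals i.
toy_pssm = [[float(i)] * 20 for i in range(6)]
toy_features = segment_means(toy_pssm, n_segments=3)
print(len(toy_features))  # 3 segments x 20 columns
```

A fixed-length vector of this kind could then be passed to any standard SVM implementation. Averaging within segments, rather than over the whole sequence, is what preserves local information along the chain.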