College of Science, Inner Mongolia Agricultural University, Hohhot 010018, China.
Biomed Res Int. 2020 Aug 2;2020:9701734. doi: 10.1155/2020/9701734. eCollection 2020.
Bacteria are abundant in the environment, and Gram-positive bacteria are the most common. Some Gram-positive bacteria are very harmful to humans, so predicting the subcellular location of their proteins is important, in particular for developing effective drugs. In this paper, a new dataset of Gram-positive bacterial protein subcellular locations was established. Five kinds of characteristic parameters were selected and combined: amino acid composition, gene ontology annotation information, hydropathy dipeptide composition, amino acid dipeptide composition, and autocovariance average chemical shift information. Protein locations were predicted with the Support Vector Machine (SVM) algorithm, and the overall accuracy (OA) reached 86.1% under the Jackknife test, higher than that of existing methods. This improved method may be helpful for protein function prediction.
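As an illustration only, two of the sequence-derived features named in the abstract (amino acid composition and amino acid dipeptide composition) and the Jackknife (leave-one-out) overall accuracy could be sketched as follows. This is a minimal sketch, not the authors' implementation: the function names, the handling of non-standard residues, and the `fit_predict` callback are all assumptions made for the example, and the gene ontology, hydropathy, and chemical shift features are omitted.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
# All 400 ordered residue pairs, in a fixed order so feature vectors align.
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Return the 20 residue frequencies of seq.

    Assumption: non-standard residues (e.g. X, B, Z) are simply skipped.
    """
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    total = len(residues)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(a) / total for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return the 400 frequencies of adjacent residue pairs in seq.

    Assumption: pairs containing a non-standard residue are skipped.
    """
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for a, b in zip(seq, seq[1:]):
        if a + b in counts:
            counts[a + b] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def combined_features(seq):
    """Concatenate the two feature groups into one 420-dimensional vector."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)


def jackknife_overall_accuracy(features, labels, fit_predict):
    """Leave-one-out estimate of overall accuracy (OA).

    Each sample is held out once; fit_predict(train_x, train_y, x) is a
    hypothetical callback that trains on the remaining samples and returns
    the predicted label for the held-out vector x.
    """
    if not features:
        raise ValueError("at least one sample is required")
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if fit_predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)
```

To mirror the abstract, `fit_predict` would wrap an SVM, for instance by fitting a fresh classifier from an SVM library on `train_x` and `train_y` at each round and predicting the single held-out vector. The kernel and hyperparameters are not given in the abstract, so any choice here would be an assumption.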