Prediction of Protein Sub-Mitochondria Locations Using Protein Interaction Networks.

Author information

Hoseini Adele Sadat Haghighat, Mirzarezaee Mitra

Affiliations

Department of Computer Engineering, Science and Research branch, Islamic Azad University, Tehran, Iran.

School of Biological Science, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran.

Publication information

Iran J Biotechnol. 2018 Aug 11;16(3):e1933. doi: 10.15171/ijb.1933. eCollection 2018 Aug.

DOI:10.15171/ijb.1933

PMID:31457027

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC6697825/

Abstract

BACKGROUND

Predicting protein localization is among the most important problems in bioinformatics; it is used to determine where proteins reside in cells and in organelles such as mitochondria. In this study, several machine learning algorithms are applied to predict intracellular protein locations. These algorithms use features extracted from protein sequences. In contrast, protein interactions have been less investigated.

OBJECTIVES

Because protein interactions usually occur in the same or adjacent places, using interactions as a feature to infer location should be efficient and effective. This study did not aim to increase the overall accuracy of previously conducted research. Instead, it focused on protein interaction features and on how their use leads to higher accuracy.
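The idea that interacting proteins tend to share a location can be illustrated with a simple neighbor vote. The sketch below is not the authors' method: the function name `predict_by_neighbors`, the protein identifiers, and the location labels are hypothetical, chosen only to show how an interaction list could be turned into a location guess.

```python
from collections import Counter

def predict_by_neighbors(protein, interactions, known_locations):
    """Guess a location by majority vote among annotated interaction partners.

    interactions: iterable of (protein_a, protein_b) pairs (undirected).
    known_locations: dict mapping protein id -> location label.
    Returns None when no annotated partner exists.
    """
    # Collect every partner of the query protein, in either pair position.
    neighbors = set()
    for a, b in interactions:
        if a == protein:
            neighbors.add(b)
        elif b == protein:
            neighbors.add(a)
    neighbors.discard(protein)  # ignore self-interactions

    # Count the locations of partners whose location is known.
    votes = Counter(known_locations[n] for n in neighbors if n in known_locations)
    if not votes:
        return None
    # Ties are broken alphabetically so the result is deterministic.
    return max(sorted(votes), key=lambda loc: votes[loc])

# Hypothetical toy network: Q interacts with A, B, and C.
toy_interactions = [("Q", "A"), ("Q", "B"), ("C", "Q"), ("A", "B")]
toy_locations = {"A": "matrix", "B": "matrix", "C": "inner membrane"}
guess = predict_by_neighbors("Q", toy_interactions, toy_locations)
```

A vote like this uses no sequence information at all, which is what makes it a useful contrast to sequence-derived features.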

MATERIALS AND METHODS

In this study, we examined the protein interaction network as one of the features for predicting protein localization and assessed its effect on the prediction results. To this end, we gathered some of the most common features, including amino acid composition, dipeptide composition, pseudo amino acid composition (PseAAC), position-specific scoring matrix (PSSM), functional domain, Gene Ontology information, and pairwise sequence alignment. The classification results obtained with these features were compared with those obtained using protein interactions. Different machine learning algorithms were tested for this purpose.
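Two of the sequence features named above, amino acid composition and dipeptide composition, have standard definitions as normalized residue and residue-pair frequencies. The following is a minimal sketch under that standard definition; the function names and the handling of non-standard residues (they are skipped) are assumptions, not details taken from the paper.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs

def amino_acid_composition(seq):
    """Return a 20-dimensional vector of residue frequencies."""
    seq = seq.upper()
    counts = dict.fromkeys(AMINO_ACIDS, 0)
    for residue in seq:
        if residue in counts:  # assumption: skip non-standard symbols
            counts[residue] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]

def dipeptide_composition(seq):
    """Return a 400-dimensional vector of adjacent residue-pair frequencies."""
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # assumption: skip pairs with non-standard symbols
            counts[pair] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[dp] / total for dp in DIPEPTIDES]

# Toy sequence, for illustration only.
aac = amino_acid_composition("AACD")
dpc = dipeptide_composition("AACD")
```

Concatenating such vectors gives a fixed-length input for a classifier regardless of sequence length, which is why composition features are a common baseline.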

RESULTS

Among single feature sets, the best results were obtained with the SVM classifier on the PseAAC feature. Combining all features with protein-protein interaction (PPI) data gave accuracies of 82.49% and 83.35% with the Decision Tree and Random Forest classifiers, respectively. In another experiment, using only protein interaction data with different cutting points yielded an accuracy of 93.035% for protein location prediction.

CONCLUSION

Overall, protein interactions were shown to have a significant impact on the prediction of mitochondrial protein locations. This feature can distinguish the locations well on its own. Using it raises the accuracy of the results by up to 5%.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/51aa/6697825/8a272d6bf6cd/ijb-2018-03-e1933-g001.jpg
