Prediction of protein-protein interactions with clustered amino acids and weighted sparse representation.

Authors

Huang Qiaoying, You Zhuhong, Zhang Xiaofeng, Zhou Yong

Affiliations

Shenzhen Graduate School, Harbin Institute of Technology, HIT Campus of University Town of Shenzhen, Shenzhen 518055, China.

School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China.

Publication information

Int J Mol Sci. 2015 May 13;16(5):10855-69. doi: 10.3390/ijms160510855.

DOI:10.3390/ijms160510855
PMID:25984606
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4463679/
Abstract

With the completion of the Human Genome Project, bioscience has entered the era of the genome and proteome, and research on protein-protein interactions (PPIs) is becoming increasingly important. Life activities such as DNA synthesis, gene transcription activation and protein translation are inseparable from PPIs. Although many methods based on biological experiments and machine learning have been proposed, they all require long learning times and yield limited accuracy. Predicting PPIs efficiently and accurately therefore remains a major challenge. To address it, we developed a new predictor that incorporates reduced amino acid alphabet (RAAA) information into the general form of pseudo-amino acid composition (PseAAC) and uses weighted sparse representation-based classification (WSRC). The main advantage of introducing the reduced amino acid alphabet is that it avoids the notorious curse of dimensionality, or overfitting problem, in statistical prediction. Experiments show that our method performs well in both low- and high-dimensional feature spaces. Among all experiments performed on the PPI data of Saccharomyces cerevisiae, the best achieved 90.91% accuracy, 94.17% sensitivity, 87.22% precision and a Matthews correlation coefficient (MCC) of 83.43%. To evaluate the predictive ability of our method, we performed extensive experiments comparing it with a state-of-the-art technique, the support vector machine (SVM). The results show that the proposed approach is very promising for predicting PPIs and can be a helpful supplement for PPI prediction.
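
As a rough illustration of why a reduced amino acid alphabet shrinks the feature space, the sketch below maps the 20 standard residues onto a few clusters and computes a per-cluster composition vector for each protein in a pair. The six-cluster grouping, the cluster labels and the function names are hypothetical choices made for this example; they are not the clustering or the PseAAC formulation used in the paper, and the WSRC classifier is not shown.

```python
from collections import Counter

# Hypothetical grouping of the 20 standard amino acids into 6 clusters.
# Chosen for illustration only; the paper's actual clusters may differ.
CLUSTERS = {
    "a": "AVLIMC",  # aliphatic / hydrophobic
    "b": "FWYH",    # aromatic
    "c": "STNQ",    # polar uncharged
    "d": "KR",      # positively charged
    "e": "DE",      # negatively charged
    "f": "GP",      # small / conformationally special
}

# Reverse lookup: residue letter -> cluster label.
RESIDUE_TO_CLUSTER = {
    residue: label for label, members in CLUSTERS.items() for residue in members
}


def reduce_sequence(seq):
    """Rewrite a protein sequence in the reduced alphabet, skipping unknown letters."""
    return "".join(
        RESIDUE_TO_CLUSTER[r] for r in seq.upper() if r in RESIDUE_TO_CLUSTER
    )


def composition(seq):
    """Return the fraction of residues falling into each cluster, in CLUSTERS order."""
    reduced = reduce_sequence(seq)
    counts = Counter(reduced)
    total = len(reduced) or 1  # avoid division by zero for empty input
    return [counts[label] / total for label in CLUSTERS]


def pair_features(seq_a, seq_b):
    """Concatenate the two composition vectors to describe a protein pair."""
    return composition(seq_a) + composition(seq_b)


if __name__ == "__main__":
    print(reduce_sequence("AKDE"))
    print(pair_features("AKDE", "MKVLAA"))
```

With this grouping, a plain composition vector has 6 entries per protein instead of 20, and the saving grows quickly for dipeptide or higher-order features (36 versus 400 for pairs), which is the dimensionality argument the abstract makes.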
