Heterodimeric protein complex identification by naïve Bayes classifiers.

Affiliation

Institute of Mathematics for Industry, Kyushu University, Fukuoka, Japan.

Publication information

BMC Bioinformatics. 2013 Dec 3;14:347. doi: 10.1186/1471-2105-14-347.

DOI:10.1186/1471-2105-14-347
PMID:24299017
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4219333/
Abstract

BACKGROUND

Protein complexes are basic cellular entities that carry out the functions of their components. In databases of yeast protein complexes such as CYC2008, heterodimeric complexes are the major type of known protein complex. Although a number of methods have been proposed for simultaneously predicting sets of proteins that form arbitrary types of protein complexes, they often fail to predict heterodimeric complexes.

RESULTS

In this paper, we design several features that characterize heterodimeric protein complexes based on genomic data sets, and propose a supervised-learning method for predicting heterodimeric protein complexes. The method learns the parameters of the features, which are embedded in a naïve Bayes classifier. The log-likelihood ratio derived from the naïve Bayes classifier, with parameter values obtained by maximum likelihood estimation, gives the score of a given pair of proteins and is used to predict whether the pair is a heterodimeric complex. Five-fold cross-validation shows good performance on yeast. The trained classifiers also show higher predictability than various existing algorithms on yeast data sets under both approximate and exact matching criteria.
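As an illustration of the scoring idea described above, the following is a minimal sketch of a naïve Bayes log-likelihood ratio over discrete features. It is not the authors' tool: the function names, the two binary toy features, the toy data, and the `pseudocount` smoothing term are all assumptions introduced here. With `pseudocount=0` the estimates reduce to plain maximum likelihood (relative frequencies), but a feature value never seen in one class would then give a zero probability, so the sketch defaults to a small pseudocount.

```python
import math
from collections import Counter


def fit_feature_tables(samples, labels, n_values, pseudocount=1.0):
    """Estimate P(feature_j = v | class) for both classes by counting.

    samples:     list of tuples; samples[i][j] is an int in range(n_values[j])
    labels:      list of bools; True marks a (hypothetical) heterodimeric pair
    n_values:    number of distinct values each feature can take
    pseudocount: assumed smoothing term; 0 gives the plain MLE
    """
    tables = {}
    for cls in (True, False):
        rows = [x for x, y in zip(samples, labels) if y == cls]
        cls_tables = []
        for j, k in enumerate(n_values):
            counts = Counter(row[j] for row in rows)
            total = len(rows) + pseudocount * k
            cls_tables.append(
                [(counts[v] + pseudocount) / total for v in range(k)]
            )
        tables[cls] = cls_tables
    return tables


def log_likelihood_ratio(tables, x):
    """Score a protein pair: sum over features of log P(x_j|pos) / P(x_j|neg)."""
    return sum(
        math.log(tables[True][j][v] / tables[False][j][v])
        for j, v in enumerate(x)
    )


# Toy data: two made-up binary features per protein pair.
toy_samples = [(1, 1), (1, 0), (1, 1), (0, 0), (0, 1), (0, 0)]
toy_labels = [True, True, True, False, False, False]
toy_tables = fit_feature_tables(toy_samples, toy_labels, n_values=[2, 2])

for pair in [(1, 1), (0, 0)]:
    print(pair, round(log_likelihood_ratio(toy_tables, pair), 3))
```

A positive score means the feature values are more likely under the positive class than under the negative class. Choosing the score threshold for calling a pair a heterodimeric complex is a separate decision that this sketch does not cover.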

CONCLUSIONS

Heterodimeric protein complex prediction is a harder problem than heteromeric protein complex prediction because heterodimeric protein complexes are topologically simpler. However, designing features specialized for heterodimeric protein complexes improves their predictability. Thus, the design of more sophisticated features for heterodimeric protein complexes, together with the accumulation of more accurate and useful genome-wide data sets, will lead to higher predictability of heterodimeric protein complexes. Our tool can be downloaded from http://imi.kyushu-u.ac.jp/~om/.




