Evaluation of clustering algorithms for protein complex and protein interaction network assembly.

Author information

Sardiu Mihaela E, Florens Laurence, Washburn Michael P

Affiliation

Stowers Institute for Medical Research, Kansas City, Missouri 64110, USA.

Publication information

J Proteome Res. 2009 Jun;8(6):2944-52. doi: 10.1021/pr900073d.

DOI:10.1021/pr900073d

PMID:19317493

Abstract

Assembling protein complexes and protein interaction networks from affinity purification-based proteomics data sets remains a challenge. When little a priori knowledge of the complexes exists, it is difficult to place proteins in the proper locations and evaluate the results of clustering approaches. Here we have systematically compared multiple hierarchical and partitioning clustering approaches using a well-characterized but highly complex human protein interaction network data set centered around the conserved AAA+ ATPases Tip49a and Tip49b. This network provides a challenge to clustering algorithms because Tip49a and Tip49b are present in four distinct complexes, the network contains modules, and the network has multiple attachments. We compared the use of binary data, quantitative proteomics data in the form of normalized spectral abundance factors, and Z-score normalization. In our analysis, a partitioning approach indicated the major modules in a network. Next, while Euclidean distance was sensitive to scaling, with data transformation, all the attachments in a data set were recovered in one branch of a dendrogram. Finally, when Pearson correlation and hierarchical clustering were used, complexes were well separated and their attachments were placed in the proper locations. Each of these three approaches provided distinct information useful for assembly of a network of multiple protein complexes.
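
The contrast the abstract draws between Euclidean distance and Pearson correlation can be illustrated with a small sketch. It is not the authors' code: the NSAF formula is the commonly used definition (spectral count divided by protein length, normalized by the sum of that ratio over all proteins in a purification), and the function names, the toy profiles, and the choice of a per-protein Z-score are assumptions made only for illustration.

```python
import math

def nsaf(spectral_counts, lengths):
    """NSAF for one purification: (SpC / L) divided by the sum of SpC / L.

    Assumed standard definition; inputs are hypothetical example values.
    """
    saf = [c / l for c, l in zip(spectral_counts, lengths)]
    total = sum(saf)
    return [s / total for s in saf]

def zscore(profile):
    """Z-score of one protein's profile across purifications.

    Assumes the profile is not constant (non-zero standard deviation).
    """
    mean = sum(profile) / len(profile)
    sd = math.sqrt(sum((v - mean) ** 2 for v in profile) / len(profile))
    return [(v - mean) / sd for v in profile]

def euclidean(a, b):
    """Euclidean distance between two profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pearson_distance(a, b):
    """1 - Pearson correlation; 0 means identical profile shape."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(va * vb)

# Toy abundance profiles across four hypothetical bait purifications.
# protein_b has the same shape as protein_a but is ten times less abundant.
protein_a = [0.10, 0.20, 0.05, 0.01]
protein_b = [v * 0.1 for v in protein_a]
protein_c = [0.01, 0.05, 0.20, 0.10]

# Euclidean distance on raw values is driven by the abundance difference.
raw_ab = euclidean(protein_a, protein_b)
# After Z-scoring each profile, the scale difference disappears.
scaled_ab = euclidean(zscore(protein_a), zscore(protein_b))
# Pearson distance ignores scale without any transformation.
corr_ab = pearson_distance(protein_a, protein_b)
corr_ac = pearson_distance(protein_a, protein_c)

print("Euclidean a-b (raw):", raw_ab)
print("Euclidean a-b (Z-scored):", scaled_ab)
print("Pearson distance a-b:", corr_ab)
print("Pearson distance a-c:", corr_ac)
```

In this toy case, two proteins that co-purify with the same pattern but at different abundance look far apart under raw Euclidean distance and essentially identical under Pearson distance or after Z-scoring, which is consistent with the scaling sensitivity described in the abstract. A distance matrix built this way could then be passed to any hierarchical clustering routine.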

Similar articles

Homogeneous decomposition of protein interaction networks: refining the description of intra-modular interactions.

Bioinformatics. 2009 Apr 1;25(7):926-32. doi: 10.1093/bioinformatics/btp083. Epub 2009 Feb 17.

A high-accuracy consensus map of yeast protein complexes reveals modular nature of gene essentiality.

BMC Bioinformatics. 2007 Jul 2;8:236. doi: 10.1186/1471-2105-8-236.

Identification of functional modules in a PPI network by clique percolation clustering.

Comput Biol Chem. 2006 Dec;30(6):445-51. doi: 10.1016/j.compbiolchem.2006.10.001. Epub 2006 Nov 13.

A degree-distribution based hierarchical agglomerative clustering algorithm for protein complexes identification.

Comput Biol Chem. 2011 Oct 12;35(5):298-307. doi: 10.1016/j.compbiolchem.2011.07.005. Epub 2011 Jul 20.

Protein complex prediction via cost-based clustering.

Bioinformatics. 2004 Nov 22;20(17):3013-20. doi: 10.1093/bioinformatics/bth351. Epub 2004 Jun 4.

Integrated analysis of multiple data sources reveals modular structure of biological networks.

Biochem Biophys Res Commun. 2006 Jun 23;345(1):302-9. doi: 10.1016/j.bbrc.2006.04.088. Epub 2006 Apr 27.

A unified representation of multiprotein complex data for modeling interaction networks.

Proteins. 2004 Oct 1;57(1):99-108. doi: 10.1002/prot.20147.

Protein complex prediction based on simultaneous protein interaction network.

Bioinformatics. 2010 Feb 1;26(3):385-91. doi: 10.1093/bioinformatics/btp668. Epub 2009 Dec 4.

An ensemble framework for clustering protein-protein interaction networks.

Bioinformatics. 2007 Jul 1;23(13):i29-40. doi: 10.1093/bioinformatics/btm212.

Cited by

Comparative- and network-based proteomic analysis of bacterial chondronecrosis with osteomyelitis lesions in broiler's proximal tibiae identifies new molecular signatures of lameness.

Sci Rep. 2023 Apr 12;13(1):5947. doi: 10.1038/s41598-023-33060-y.

SARS-CoV-2 Infection Induces Psoriatic Arthritis Flares and Enthesis Resident Plasmacytoid Dendritic Cell Type-1 Interferon Inhibition by JAK Antagonism Offer Novel Spondyloarthritis Pathogenesis Insights.

Front Immunol. 2021 Apr 15;12:635018. doi: 10.3389/fimmu.2021.635018. eCollection 2021.

SFPQ and Tau: critical factors contributing to rapid progression of Alzheimer's disease.

Acta Neuropathol. 2020 Sep;140(3):317-339. doi: 10.1007/s00401-020-02178-y. Epub 2020 Jun 23.

Generating topological protein interaction scores and data visualization with TopS.

Methods. 2020 Dec 1;184:13-18. doi: 10.1016/j.ymeth.2019.08.010. Epub 2019 Aug 30.

Comparative and network-based proteomic analysis of low dose ethanol- and lipopolysaccharide-induced macrophages.

PLoS One. 2018 Feb 26;13(2):e0193104. doi: 10.1371/journal.pone.0193104. eCollection 2018.

Capturing protein communities by structural proteomics in a thermophilic eukaryote.

Mol Syst Biol. 2017 Jul 25;13(7):936. doi: 10.15252/msb.20167412.

Identification of Topological Network Modules in Perturbed Protein Interaction Networks.

Sci Rep. 2017 Mar 8;7:43845. doi: 10.1038/srep43845.

Affinity purification-mass spectrometry and network analysis to understand protein-protein interactions.

Nat Protoc. 2014 Nov;9(11):2539-54. doi: 10.1038/nprot.2014.164. Epub 2014 Oct 2.

Inferring protein-protein interaction complexes from immunoprecipitation data.

BMC Res Notes. 2013 Nov 15;6:468. doi: 10.1186/1756-0500-6-468.

A sampling framework for incorporating quantitative mass spectrometry data in protein interaction analysis.

BMC Bioinformatics. 2013 Oct 4;14:299. doi: 10.1186/1471-2105-14-299.
