Liang Chunguang, Schaack Dominik, Srivastava Mugdha, Gupta Shishir K, Sarukhanyan Edita, Giese Anne, Pagels Martin, Romanov Natalie, Pané-Farré Jan, Fuchs Stephan, Dandekar Thomas
Department of Bioinformatics, Biocenter, University of Würzburg, 97074 Würzburg, Germany.
Institut für Mikrobiologie Ernst-Moritz-Arndt-Universität Greifswald, Friedrich-Ludwig-Jahn-Straße 15, D-17487 Greifswald, Germany.
Proteomes. 2016 Feb 19;4(1):8. doi: 10.3390/proteomes4010008.
Staphylococcus aureus is an important model organism and pathogen. This proteome overview details shared and specific proteins and selected virulence-relevant protein complexes from representative strains of all three major clades. To determine the strain distribution and major clades, we used a refined strain comparison combining ribosomal RNA, MLST markers, and highly conserved regions shared between strains. This analysis shows three sub-clades (A-C) for S. aureus. Because the calculations are complex and strain annotation is time-consuming, we compare key representatives of each clade: model strains COL, USA300, Newman, and HG001 (clade A), model strains N315 and Mu50 (clade B), and ED133 and MRSA252 (clade C). We examine these individual proteomes and compare them against a background of 64 strains. Overall, 13,284 proteins are not part of the core proteome; they are involved in different strain-specific or more general complexes that require detailed annotation and new experimental data to be accurately delineated. By comparing the eight representative strains, we identify strain-specific proteins (e.g., 18 in COL, 105 in N315, and 44 in Newman) that characterize each strain, and we analyze pathogenicity islands where they contain such strain-specific proteins. We identify strain-specific protein repertoires involved in virulence, cell wall metabolism, and phosphorylation. Finally, we compare and analyze protein complexes that are conserved and well characterized among S. aureus (103 complexes in total), and we predict and analyze several individual protein complexes, including structure modeling, in the three clades.
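The distinction between core, shared, and strain-specific proteins can be illustrated with a minimal sketch. It assumes that the proteins of each strain have already been clustered into ortholog groups by some upstream method; the function name `partition_proteome` and the group identifiers are hypothetical, the toy input is made up, and this is not the authors' pipeline.

```python
from collections import Counter


def partition_proteome(strain_to_groups):
    """Split ortholog groups into core, accessory, and strain-specific sets.

    strain_to_groups maps a strain name to the set of ortholog-group
    identifiers found in that strain (hypothetical input format).
    """
    n_strains = len(strain_to_groups)
    # Count in how many strains each ortholog group occurs.
    counts = Counter(
        group for groups in strain_to_groups.values() for group in set(groups)
    )
    # Core: present in every strain.
    core = {g for g, c in counts.items() if c == n_strains}
    # Accessory: present in more than one strain, but not in all.
    accessory = {g for g, c in counts.items() if 1 < c < n_strains}
    # Strain-specific: present in exactly one strain.
    specific = {
        strain: {g for g in groups if counts[g] == 1}
        for strain, groups in strain_to_groups.items()
    }
    return core, accessory, specific


# Toy input with made-up group identifiers (not data from the paper).
toy = {
    "COL": {"og1", "og2", "og3"},
    "N315": {"og1", "og2", "og4"},
    "Newman": {"og1", "og5"},
}
core, accessory, specific = partition_proteome(toy)
```

In this toy case `og1` is the only core group, `og2` is shared by two strains, and each strain has exactly one strain-specific group. With real data the result depends strongly on how ortholog groups are defined and on which background strains are included.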