Institute of BioMedical Informatics, National Yang-Ming University, Taipei, Taiwan.
Genomics. 2012 Dec;100(6):370-9. doi: 10.1016/j.ygeno.2012.08.001. Epub 2012 Aug 22.
Tandem repetition of domains in protein sequences occurs in all three domains of life. It creates protein diversity and adds functional complexity to organisms. In this work, we analyzed 52 streptococcal genomes and found that 3748 proteins contain domain repeats. Proteins without domain repeats are significantly enriched in the cytoplasm, whereas proteins with domain repeats are significantly enriched in the cytoplasmic membrane, cell wall and extracellular locations. Domain repetition occurs most frequently in S. pneumoniae and least frequently in S. thermophilus and S. pyogenes. DUF1542 is the domain repeated the most times within a single protein, followed by Rib, CW_binding_1, G5 and HemolysinCabind. The 3D structures of 24 repeat-containing proteins were predicted to investigate the structural and functional effects of domain repetition. Several repeat-containing streptococcal cell surface proteins are known to be virulence-associated. Surface-associated proteins that contain tandem domains but lack experimental functional characterization may be involved in the pathogenesis of streptococci and deserve further investigation.
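As an illustration of what "the domain repeated the most times within a single protein" means in practice, the following is a minimal sketch of how tandem domain repeats could be counted from an ordered list of domain assignments. It is not the authors' pipeline. The function names, the `min_copies=2` threshold, and the example architecture are all assumptions made for illustration; the abstract does not state how repeats were defined or detected.

```python
from itertools import groupby


def longest_tandem_run(domains):
    """Return (domain, copies) for the longest run of consecutive identical
    domain names in an N-to-C ordered list; (None, 0) for an empty list."""
    best = (None, 0)
    for name, group in groupby(domains):
        copies = sum(1 for _ in group)
        if copies > best[1]:
            best = (name, copies)
    return best


def has_domain_repeat(domains, min_copies=2):
    """True if any domain occurs at least `min_copies` times in tandem.
    The threshold of 2 is an assumption, not taken from the paper."""
    return longest_tandem_run(domains)[1] >= min_copies


# Hypothetical architecture for illustration only, not data from the study.
example = ["YSIRK_signal", "Rib", "Rib", "Rib", "Gram_pos_anchor"]
print(longest_tandem_run(example))
print(has_domain_repeat(example))
```

Applying `longest_tandem_run` to every protein in a genome and taking the maximum per domain family would give a ranking of the kind reported for DUF1542, Rib, CW_binding_1, G5 and HemolysinCabind, assuming the domain assignments are already sorted by sequence position.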