Allen Rondine J, Brenner Evan P, VanOrsdel Caitlin E, Hobson Jessica J, Hearn David J, Hemm Matthew R
Department of Biological Sciences, Towson University, Towson, MD 21252, USA.
BMC Genomics. 2014 Dec 5;15(1):946. doi: 10.1186/1471-2164-15-946.
The reliable identification of proteins containing 50 or fewer amino acids is difficult due to the limited information content in short sequences. The 37 amino acid CydX protein in Escherichia coli is a member of the cytochrome bd oxidase complex, an enzyme found throughout Eubacteria. To investigate the extent of CydX conservation and prevalence and evaluate different methods of small protein homologue identification, we surveyed 1095 Eubacteria species for the presence of the small protein.
Over 300 homologues were identified, including 80 unannotated genes. The ability of both closely related and divergent homologues to complement the E. coli ΔcydX mutant supports our identification techniques and suggests that CydX homologues retain similar function among divergent species. However, sequence analysis of these proteins shows a great degree of variability, with only a few highly conserved residues. An analysis of the co-variation between CydX homologues and their corresponding cydA and cydB genes shows a close synteny of the small protein with the CydA long Q-loop. Phylogenetic analysis suggests that the cydABX operon has undergone horizontal gene transfer, although the cydX gene likely evolved in a progenitor of the Alpha-, Beta-, and Gammaproteobacteria. Further investigation of cydAB operons identified two additional conserved hypothetical small proteins: CydY, encoded in CydAQlong operons that lack cydX, and CydZ, encoded in more than 150 CydAQshort operons.
This study provides a systematic analysis of bioinformatics techniques required for the unique challenges present in small protein identification and phylogenetic analyses. These results elucidate the prevalence of CydX throughout the Proteobacteria, provide insight into the selection pressure and sequence requirements for CydX function, and suggest a potential functional interaction between the small protein and the CydA Q-loop, an enigmatic domain of the cytochrome bd oxidase complex. Finally, these results identify other conserved small proteins encoded in cytochrome bd oxidase operons, suggesting that small protein subunits may be a more common component of these enzymes than previously thought.