Rodriguez Yacidzohara, Gonzalez-Mendez Ricardo R, Cadilla Carmen L
Department of Biochemistry, School of Medicine, University of Puerto Rico, San Juan, Puerto Rico, United States of America.
Department of Radiological Sciences, School of Medicine, University of Puerto Rico, San Juan, Puerto Rico, United States of America.
PLoS One. 2016 Aug 24;11(8):e0161029. doi: 10.1371/journal.pone.0161029. eCollection 2016.
Twist proteins belong to the basic helix-loop-helix (bHLH) family of multifunctional transcription factors. These factors are known to use domains other than the common bHLH in protein-protein interactions. Much work has characterized the roles of the bHLH domain and the C-terminus in protein-protein interactions but, despite a few attempts, the N-terminus still needs more attention. Since the region of highest diversity in Twist proteins is the N-terminus, we analyzed the conservation of this region in different vertebrate Twist proteins and studied the sequence differences between Twist1 and Twist2, with emphasis on the glycine-rich regions found in Twist1. We found a highly conserved sequence motif of unknown function in all mammalian Twist1 (SSSPVSPADDSLSNSEEE) and Twist2 (SSSPVSPVDSLGTSEEE) sequences. Through sequence comparison we demonstrate that the Twist protein family ancestor was "Twist2-like" and that the two glycine-rich regions found in Twist1 sequences were acquired late in evolution, apparently not at the same time. The second glycine-rich region started developing first in the fish vertebrate group, while the first glycine-rich region arose afterwards within the reptiles. Disordered domain and secondary structure predictions showed that the amino acid sequence and disorder feature found at the N-terminus are highly evolutionarily conserved and could form a functional site that interacts with other proteins. Detailed examination of the glycine-rich regions in the N-terminus of Twist1 demonstrates that the first region is completely aliphatic, while the second region contains some polar residues that could be subject to post-translational modification. Phylogenetic and sequence space analysis showed that the Twist1 subfamily is the result of a gene duplication during Twist2 evolution in vertebrate fish, and that it has undergone more evolutionary drift than Twist2.
We identified a new signature motif that is characteristic of each Twist paralog and identified important residues within this motif that can be used to distinguish between these two paralogs, which will help reduce Twist1 and Twist2 sequence annotation errors in public databases.
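As a rough illustration of how a signature motif could be used to flag Twist1/Twist2 annotation errors, the minimal Python sketch below compares a protein sequence against the two mammalian motifs quoted above. It is not the authors' method: the sliding-window mismatch count, the helper names `best_window_mismatches` and `classify_twist`, and the `max_mismatches` threshold are all assumptions made for illustration, and the specific discriminating residues reported in the paper are not reproduced here.

```python
# Motif strings are quoted from the abstract (mammalian Twist1 and Twist2).
TWIST1_MOTIF = "SSSPVSPADDSLSNSEEE"
TWIST2_MOTIF = "SSSPVSPVDSLGTSEEE"


def best_window_mismatches(sequence, motif):
    """Fewest mismatches between `motif` and any same-length window of `sequence`.

    Returns None when the sequence is shorter than the motif.
    """
    seq = sequence.upper()
    n = len(motif)
    if len(seq) < n:
        return None
    return min(
        sum(a != b for a, b in zip(seq[i:i + n], motif))
        for i in range(len(seq) - n + 1)
    )


def classify_twist(sequence, max_mismatches=2):
    """Hypothetical helper: label a sequence by the closer of the two motifs.

    `max_mismatches` is an arbitrary tolerance chosen for this sketch,
    not a value taken from the paper.
    """
    hits = []
    for label, motif in (("Twist1", TWIST1_MOTIF), ("Twist2", TWIST2_MOTIF)):
        distance = best_window_mismatches(sequence, motif)
        if distance is not None and distance <= max_mismatches:
            hits.append((distance, label))
    if not hits:
        return "unknown"
    hits.sort()
    # Two motifs matching equally well cannot be resolved by this simple rule.
    if len(hits) == 2 and hits[0][0] == hits[1][0]:
        return "ambiguous"
    return hits[0][1]


if __name__ == "__main__":
    # Toy sequences built around the motifs; the flanking residues are made up.
    print(classify_twist("MMQD" + TWIST1_MOTIF + "AAA"))
    print(classify_twist("MM" + TWIST2_MOTIF + "KK"))
```

A database record labeled Twist1 that this check classifies as Twist2 (or the reverse) would be a candidate for manual review, not proof of misannotation, since non-mammalian sequences may diverge from these motifs.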