Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China.
J Theor Biol. 2013 Jul 7;328:33-42. doi: 10.1016/j.jtbi.2013.03.002. Epub 2013 Mar 15.
Many experiments show that intron loss or gain can influence many stages of mRNA metabolism. However, existing studies do not consider post-spliced introns directly. Here, the optimal matched regions between introns and their corresponding protein coding sequences in ribosomal protein genes are investigated in detail using improved Smith-Waterman local alignment software. Within introns, the optimal matched regions are located in the central non-conserved regions, and their distribution characteristics differ among intron groups. Long introns contain two optimal matched regions, the first of which is more conserved than the second, whereas short introns contain only one. Protein coding sequences contain both optimal matched regions and forbidden regions, notably two conserved forbidden regions located at about 10% and 80% of the length of the protein coding sequence. The forbidden regions may be potential protein-binding regions. The match rates of most optimal matched segments range between 65% and 75%, which corresponds to weak matching. The interaction between post-spliced introns and their corresponding protein coding sequences may play a key role in gene expression.
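The abstract does not describe what the improved Smith-Waterman software changes, so the sketch below shows only the textbook Smith-Waterman local alignment with a linear gap penalty, not the authors' method. The scoring values (`match=2`, `mismatch=-1`, `gap=-2`), the function names, and the definition of match rate as identical positions divided by alignment length are all assumptions made for illustration.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Textbook local alignment with a linear gap penalty (scores are assumed values).

    Returns (best_score, aligned_a, aligned_b, start_a, start_b), where the
    start values are 0-based offsets of the aligned segment in a and b.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the scoring matrix; cells never drop below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom)), i, j


def match_rate(aligned_a, aligned_b):
    """Fraction of alignment columns holding identical bases (assumed definition)."""
    if not aligned_a:
        return 0.0
    same = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return same / len(aligned_a)
```

For example, `smith_waterman("TTTACGTACGTTT", "GGGACGTACGGG")` recovers the shared segment `ACGTACG` starting at offset 3 in both sequences, and `match_rate("ACGT", "ACCT")` gives 0.75. Under this assumed definition, a segment in the 65% to 75% range reported above would have roughly two to three identical positions in every four aligned columns.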