Tian X, Strassmann J E, Queller D C
Department of Biology, Washington University in St Louis, St Louis, MO, USA.
Heredity (Edinb). 2014 Feb;112(2):215-8. doi: 10.1038/hdy.2013.96. Epub 2013 Oct 2.
Eukaryotic protein sequences often contain amino-acid homopolymers, runs of a single amino acid repeated from several to dozens of times. Some of these are functional, but others may persist largely because DNA slippage gives them high expansion rates. Homopolymers with more than a hundred repeats, however, are very rare. We report an extraordinarily long homopolymer of 306 tandem serine repeats from Dictyostelium discoideum, a single-celled eukaryote that also has a multicellular stage. The gene has a paralog with 132 repeats and orthologs, also with high serine repeat numbers, in various other Dictyostelid species. The conserved gene structure and protein sequences suggest that the homopolymer is functional. The high codon diversity and very poor alignment of serine codons in this gene between species similarly indicate functionality, because the serine homopolymer has been conserved despite extensive DNA sequence change. A survey of other very long amino-acid homopolymers in eukaryotes shows that high codon diversity is the rule, suggesting that these too may be functional.
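The codon-diversity argument can be illustrated with a short sketch. Serine is encoded by six codons in the standard genetic code, so a run that grew by slippage would be expected to reuse one codon, whereas a run maintained at the protein level can drift among all six. The Python below is a minimal illustration only: the function names, the choice of distinct-codon count and Shannon entropy as diversity measures, and the toy sequences are assumptions for this example, not the authors' method or data.

```python
from collections import Counter
from math import log2

# The six serine codons of the standard genetic code.
SERINE_CODONS = {"TCT", "TCC", "TCA", "TCG", "AGT", "AGC"}


def longest_serine_run(cds):
    """Return the codons of the longest uninterrupted serine run.

    `cds` is assumed to be an in-frame coding sequence (hypothetical input).
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]

    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for index, codon in enumerate(codons):
        if codon in SERINE_CODONS:
            if run_len == 0:
                run_start = index
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return codons[best_start:best_start + best_len]


def codon_diversity(run):
    """Return (number of distinct codons, Shannon entropy in bits) for a run."""
    counts = Counter(run)
    total = len(run)
    entropy = sum((n / total) * log2(total / n) for n in counts.values())
    return len(counts), entropy


# Toy sequences, invented for illustration (not from the paper).
slippage_like = "ATG" + "TCT" * 6 + "TAA"           # one codon repeated
diverse = "ATG" + "TCTTCCTCATCGAGTAGC" + "TAA"      # all six serine codons

for name, seq in [("slippage-like", slippage_like), ("diverse", diverse)]:
    run = longest_serine_run(seq)
    distinct, entropy = codon_diversity(run)
    print(f"{name}: {len(run)} Ser codons, {distinct} distinct, {entropy:.3f} bits")
```

Both toy sequences encode the same six-serine peptide; only the DNA-level measures separate them, which is the contrast the abstract relies on when it treats high codon diversity as evidence against recent slippage.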