Barik Sailen
EonBio, 3780 Pelham Drive, Mobile, AL 36619, USA.
Heliyon. 2017 Dec 28;3(12):e00492. doi: 10.1016/j.heliyon.2017.e00492. eCollection 2017 Dec.
A significant number of proteins in all living species contain amino acid repeats (AARs) of various lengths and compositions, many of which play important roles in protein structure and function. Here, I surveyed selected homopolymeric single [(A)n] and double [(AB)n] AARs in the human proteome. A close examination of their codon patterns and an analysis of RNA structure propensity led to the following set of empirical rules: (1) one class of amino acid repeats (Class I) uses a mixture of synonymous codons, some of which approximate the codon bias ratio in the overall human proteome; (2) the second class (Class II) disregards the codon bias ratio and appears to have originated by simple repetition of the same codon (or just a few codons); and (3) in all AARs (including Class I, Class II, and the in-betweens), the codons are chosen in a manner that precludes the formation of RNA secondary structure. The AAR genes thus appear to have evolved by balancing codon usage against mRNA secondary structure. The insights gained here should provide a better understanding of AAR evolution and may assist in designing synthetic genes.
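The distinction between a repeat encoded by a mixture of synonymous codons and one built from a single repeated codon can be illustrated with a simple codon tally. The following is a minimal sketch, not the method used in the paper: the function names are hypothetical, the input is assumed to be the in-frame coding sequence of one repeat, and no cutoff for assigning Class I or Class II is implied, since the abstract gives none.

```python
from collections import Counter


def split_codons(cds: str) -> list[str]:
    """Split an in-frame coding sequence into codons (RNA 'U' is read as 'T')."""
    cds = cds.upper().replace("U", "T")
    if not cds:
        raise ValueError("coding sequence is empty")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def codon_fractions(cds: str) -> dict[str, float]:
    """Return the fraction of the repeat encoded by each codon."""
    counts = Counter(split_codons(cds))
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


def dominant_codon_fraction(cds: str) -> float:
    """Fraction of the repeat encoded by its most frequent codon.

    Values near 1.0 suggest simple repetition of one codon; lower values
    suggest a mixture of synonymous codons. Interpretation is left to the
    reader because no threshold is given in the source.
    """
    return max(codon_fractions(cds).values())


if __name__ == "__main__":
    # Toy sequence made up for illustration, not taken from any real gene.
    toy_repeat = "CAG" * 6 + "CAA"
    print(codon_fractions(toy_repeat))
    print(dominant_codon_fraction(toy_repeat))
```

The per-codon fractions could then be compared with a proteome-wide codon usage table to see how closely a given repeat follows the overall codon bias; such a table is not included here.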