CNR-IMM, Sezione di Bologna, Via Gobetti 101, I-40129 Bologna, Italy.
J Theor Biol. 2011 Apr 21;275(1):21-8. doi: 10.1016/j.jtbi.2011.01.028. Epub 2011 Jan 26.
In 1996 Arquès and Michel [1996. A complementary circular code in the protein coding genes. J. Theor. Biol. 182, 45-58] discovered a common circular code in eukaryote and prokaryote genomes. Since then, circular code theory has attracted great interest and undergone rapid development. In this paper we discuss some theoretical issues related to the synchronization properties of coding sequences and circular codes, with particular emphasis on the problem of retrieving and maintaining the reading frame. Motivated by the theoretical discussion, we adopt a rigorous statistical approach to address two questions. First, we investigate the covering capability of the whole class of 216 self-complementary C³ maximal codes with respect to a large set of coding sequences. The results indicate that, on average, the code proposed by Arquès and Michel has the best covering capability, although there is great variability among sequences. Second, we focus on this code and explore the role played by the proportions of the bases by means of a hierarchy of permutation tests. The results point to a kind of optimization mechanism whereby coding sequences are tailored so as to maximize or minimize the coverage of circular codes on specific reading frames. Such optimization clearly relates the function of circular codes to reading frame synchronization.
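As a rough illustration of the two quantities discussed in the abstract, the following Python sketch computes a per-frame coverage and a single, simple permutation test. It rests on several assumptions that the abstract itself does not confirm: coverage is taken to be the fraction of complete trinucleotides in a given frame that belong to the code, the 20 trinucleotides in `X0` are the set commonly listed for the Arquès and Michel code, and the test shuffles individual bases (preserving base proportions), which would be only one level of the hierarchy of tests the authors describe. The function names `coverage` and `permutation_p_value` are hypothetical.

```python
import random

# Trinucleotides commonly listed for the Arquès and Michel (1996) code.
# Assumption: this set is not quoted from the abstract above.
X0 = frozenset(
    "AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC "
    "GAG GAT GCC GGC GGT GTA GTC GTT TAC TTC".split()
)

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def coverage(seq, code=X0, frame=0):
    """Fraction of complete trinucleotides, read from offset `frame`,
    that belong to `code` (an assumed definition of coverage)."""
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    if not codons:
        return 0.0
    return sum(c in code for c in codons) / len(codons)


def permutation_p_value(seq, frame=0, n_perm=1000, seed=0):
    """One-sided p-value: share of base-shuffled sequences whose coverage
    is at least the observed one. Shuffling keeps base proportions fixed."""
    rng = random.Random(seed)
    observed = coverage(seq, frame=frame)
    bases = list(seq)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(bases)
        if coverage("".join(bases), frame=frame) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

Calling `coverage(seq, frame=f)` for `f` in 0, 1, 2 gives one value per reading frame, which is the kind of frame-by-frame comparison the abstract refers to; a small p-value from `permutation_p_value` would suggest that the observed coverage in that frame is higher than base composition alone explains.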