Department of Mathematics, Faculty of Science, University of Qom, Qom, IR, Iran.
Faculty of Computer Science, Institute for Mathematical Biology, Mannheim University of Applied Sciences, Mannheim, Germany.
Theory Biosci. 2021 Feb;140(1):107-121. doi: 10.1007/s12064-020-00337-z. Epub 2021 Feb 1.
In the 1950s, Crick proposed the concept of so-called comma-free codes as an answer to the frame-shift problem that biologists encountered when studying how a sequence of nucleotide bases is translated into a protein. A little later, it turned out that this proposal does not correspond to biological reality. However, in the mid-1990s, a weaker version of comma-free codes, so-called circular codes, was discovered in nature (J Theor Biol 182:45-58, 1996). Circular codes allow the reading frame to be retrieved during the translational process in the ribosome. Surprisingly, the circular code discovered in nature is even circular in all three possible reading frames (the C^3-property). Moreover, it is maximal in the sense that it contains 20 codons, and it is self-complementary, which means that it consists of pairs of codons and their corresponding anticodons. Further investigations found that there are exactly 216 codes with the same strong properties as the code originally found in J Theor Biol 182:45-58. Using an algebraic approach, it was shown in J Math Biol, 2004 that the class of 216 maximal self-complementary C^3-codes can be partitioned into 27 equally sized equivalence classes by the action of a transformation group [Formula: see text], which is isomorphic to the dihedral group. Here, we extend these findings to circular codes over a finite alphabet of even cardinality [Formula: see text] for [Formula: see text]. We describe the corresponding group [Formula: see text] using matrices, and we investigate which classes of circular codes are split into equally sized equivalence classes under the natural equivalence relation induced by [Formula: see text]. Surprisingly, this is not always the case. All results and constructions are illustrated by examples.
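The properties named in the abstract (self-complementarity, circularity, and circularity in all three reading frames) can be illustrated with a minimal Python sketch for trinucleotide codes over {A, C, G, T}. It is not taken from the paper. The circularity test assumes a graph-theoretic criterion known from the circular-code literature: a trinucleotide code is circular if and only if the directed graph with edges b1 -> b2b3 and b1b2 -> b3 for each codon b1b2b3 is acyclic. All function names and the toy codes are hypothetical and chosen only for illustration.

```python
# Assumption: DNA alphabet with Watson-Crick complement A<->T, C<->G.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def anticodon(codon):
    """Reversed complement of a codon, e.g. AAC -> GTT."""
    return "".join(COMPLEMENT[base] for base in reversed(codon))


def is_self_complementary(code):
    """True if every codon's anticodon also belongs to the code."""
    return all(anticodon(codon) in code for codon in code)


def shift(code, k):
    """Apply the circular shift by k positions to every codon."""
    return {codon[k:] + codon[:k] for codon in code}


def is_circular(code):
    """Circularity via the (assumed) acyclicity criterion described above."""
    graph = {}
    for codon in code:
        graph.setdefault(codon[0], set()).add(codon[1:])   # b1 -> b2b3
        graph.setdefault(codon[:2], set()).add(codon[2])   # b1b2 -> b3

    state = {}  # vertex -> 1 (on current DFS path) or 2 (fully explored)

    def has_cycle(vertex):
        state[vertex] = 1
        for successor in graph.get(vertex, ()):
            if state.get(successor) == 1:
                return True
            if successor not in state and has_cycle(successor):
                return True
        state[vertex] = 2
        return False

    return not any(v not in state and has_cycle(v) for v in list(graph))


def is_c3(code):
    """True if the code and both of its circular shifts are circular."""
    return all(is_circular(shift(code, k)) for k in range(3))


toy_code = {"AAC", "GTT"}          # one codon/anticodon pair
print(is_self_complementary(toy_code), is_c3(toy_code))
print(is_circular({"ACG", "CGA"}))  # two circular shifts of the same codon
```

The toy code is deliberately tiny. A maximal code in the sense of the abstract would contain 20 codons, and the same functions apply to it unchanged.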