Ahmed Ahmed, Frey Gabriel, Michel Christian J
Equipe de Bioinformatique Théorique, LSIIT (UMR CNRS-ULP 7005), Université Louis Pasteur de Strasbourg, Pôle API, Boulevard Sébastien Brant, 67400 Illkirch, France.
In Silico Biol. 2007;7(2):155-68.
Three sets of 20 trinucleotides are preferentially associated with the reading frame and its 2 shifted frames in both eukaryotic and prokaryotic genes. These 3 sets are circular codes. They allow any frame to be retrieved in genes containing these circular code words: locally, anywhere in the 3 frames, and in particular without a start codon in the reading frame; and automatically, after reading only a few nucleotides. The circular code of the reading frame, denoted X, from which the 2 circular codes of the shifted frames can be deduced by permutation, is the information used to analyse frameshift genes, i.e. genes with a change of reading frame during translation. This work studies the circular code signal around the frameshift sites of such genes. Two scoring methods are developed: a function P based on the code X, and a function Q based on both the code X and the 4 trinucleotides with identical nucleotides. They detect a significant correlation between the code X and the -1 frameshift signals in both eukaryotic and prokaryotic genes, and between the code X and the +1 frameshift signals in eukaryotic genes.
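The permutation and frame-retrieval ideas in the abstract can be illustrated with a minimal Python sketch. The abstract does not list the 20 trinucleotides of X, nor does it define the functions P and Q, so two things here are assumptions: the set `X` below is the one usually cited in the circular code literature and should be checked against the original paper, and `codeword_fraction` and `best_frame` are hypothetical helpers that only count code words per frame. They are not the authors' scoring functions.

```python
# Assumption: the 20 trinucleotides commonly cited for the circular code X.
# The abstract does not list them; verify against the original paper.
X = frozenset(
    "AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC "
    "GAG GAT GCC GGC GGT GTA GTC GTT TAC TTC".split()
)


def rotate(word, k):
    """Circular permutation: move the first k letters of word to its end."""
    k %= len(word)
    return word[k:] + word[:k]


def shifted_code(code, k):
    """Set obtained by applying the same circular permutation to every word."""
    return frozenset(rotate(w, k) for w in code)


# The two other sets, deduced from X by permutation.
X1 = shifted_code(X, 1)
X2 = shifted_code(X, 2)


def codeword_fraction(seq, code=X, offset=0):
    """Hypothetical score: fraction of trinucleotides read from offset that are in code."""
    words = [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
    if not words:
        return 0.0
    return sum(w in code for w in words) / len(words)


def best_frame(seq, code=X):
    """Offset (0, 1 or 2) whose trinucleotides contain the most words of code."""
    return max(range(3), key=lambda f: codeword_fraction(seq, code, f))


if __name__ == "__main__":
    # Four words of X (GAA GTC TAC CTG) preceded by one extra nucleotide.
    seq = "C" + "GAAGTCTACCTG"
    for f in range(3):
        print(f, codeword_fraction(seq, X, f))
    print("best frame:", best_frame(seq))
```

Under the assumed set, X, X1 and X2 are pairwise disjoint, and together with AAA, CCC, GGG and TTT (the 4 trinucleotides with identical nucleotides mentioned for Q) they cover all 64 trinucleotides. This is why counting code words in a short window, with no start codon, can be enough to suggest which frame is being read.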