Arquès D G, Michel C J
Equipe de Biologie Théorique, Université de Franche-Comté, Laboratoire d'Informatique de Besançon, France.
Biochimie. 1993;75(5):399-407. doi: 10.1016/0300-9084(93)90173-p.
The nucleotide distribution in protein coding genes, introns and transfer RNA genes of three eukaryotic subpopulations (primates, rodents and mammals) is studied by autocorrelation functions. The autocorrelation function analysing the occurrence probability of the i-motif YRY(N)iYRY (YRY-function) in the protein coding genes and transfer RNA genes of these three subpopulations recovers the preferential occurrence of YRY(N)6YRY (R = purine = adenine or guanine, Y = pyrimidine = cytosine or thymine, N = R or Y). The autocorrelation functions analysing the occurrence probability of the i-motifs RRR(N)iRRR (RRR-function) and YYY(N)iYYY (YYY-function) identify new non-random genetic statistical properties in these three subpopulations, mainly: i) in their protein coding genes, local maxima for i ≡ 6 (mod 12) (peaks at i = 6, 18, 30, 42) with the RRR-function and local maxima for i ≡ 8 (mod 10) (peaks at i = 8, 18, 28) with the YYY-function; and ii) in their introns, local maxima for i ≡ 3 (mod 6) (peaks at i = 3, 9, 15) and a short linear decrease followed by a large exponential decrease with both the RRR- and YYY-functions. The non-random properties identified in the eukaryotic intron subpopulations are modelled by a process of random insertions and deletions of nucleotides simulating RNA editing.
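As an illustration of the kind of statistic described above, the following is a minimal sketch of an autocorrelation function over the purine/pyrimidine alphabet, written for a single sequence. The function names (`to_ry`, `motif_autocorrelation`), the `max_gap` parameter, and the normalisation by the number of windows are assumptions made for this example; the abstract does not specify how the authors normalise or average over a gene population.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def to_ry(seq):
    """Map a nucleotide sequence to the R/Y alphabet (U is treated as T)."""
    out = []
    for base in seq.upper().replace("U", "T"):
        if base in PURINES:
            out.append("R")
        elif base in PYRIMIDINES:
            out.append("Y")
        else:
            out.append("N")  # ambiguous base: never matches R or Y
    return "".join(out)


def motif_autocorrelation(seq, trinucleotide="YRY", max_gap=50):
    """Frequency of the i-motif T(N)iT for i = 0..max_gap.

    T is a trinucleotide over {R, Y}, e.g. "YRY", "RRR" or "YYY".
    Assumption: the frequency for gap i is the number of positions where
    the i-motif occurs divided by the number of windows of that length.
    """
    ry = to_ry(seq)
    m = len(trinucleotide)
    # hits[k] is True when the trinucleotide starts at position k
    hits = [ry.startswith(trinucleotide, k) for k in range(len(ry) - m + 1)]
    freqs = []
    for i in range(max_gap + 1):
        n_windows = len(ry) - (2 * m + i) + 1
        if n_windows <= 0:
            freqs.append(0.0)  # sequence too short for this gap
            continue
        count = sum(
            1 for k in range(n_windows) if hits[k] and hits[k + m + i]
        )
        freqs.append(count / n_windows)
    return freqs


# Toy sequence built so that YRY(N)6YRY occurs exactly once
toy = "CAT" + "AAAAAA" + "CAC"
yry = motif_autocorrelation(toy, "YRY", max_gap=6)

# Gap values in a congruence class, e.g. i ≡ 6 (mod 12) below 50
rrr_peak_gaps = [i for i in range(50) if i % 12 == 6]  # [6, 18, 30, 42]
```

Applied to real data, such a function would be evaluated on each sequence of a population and the resulting curves averaged before looking for peaks; that averaging step is left out here to keep the sketch short.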