Barrai I, Scapoli C, Barale R, Volinia S
Institute of Zoology, University of Ferrara, Italy.
Nucleic Acids Res. 1990 May 25;18(10):3021-5. doi: 10.1093/nar/18.10.3021.
The frequencies of oligonucleotides of length 3-6 were studied in four series of sequences: 211 sequences of human DNA (659 kilobases, kb), 22 sequences of human virus DNA (120 kb), 181 sequences of E. coli (442 kb), and 42 sequences of E. coli phages (137 kb). The sequences were obtained from GenBank(R) 48. The observed frequencies (O) were compared to the expected frequencies (E), which were obtained in two ways: 1) according to the nucleotide composition of each series, and 2) according to first-order Markov chains for triplets, second-order chains for quadruplets, and third-order chains for quintuplets and sextuplets. The ratio O/E was obtained for each oligonucleotide, and the correlation between the O/E ratios was then calculated for each pair of series. Strong correlations were observed between human and human virus sequences, and between E. coli and its phages. Other correlations were small. For higher-order Markov chains, there is some indication of correlation between viruses and phages as well. It was concluded that, through analysis of parallel oligonucleotide series, it may be possible to infer some of the complex evolutionary relationships between cells and their infectors beyond the level of codon usage.
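The O/E comparison described in the abstract can be sketched in Python. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact estimators, so the sketch assumes the usual count-based Markov estimate (for a triplet under a first-order chain, E(abc) = N(ab) N(bc) / N(b)) and assumes Pearson's coefficient for the correlation. The function names `count_kmers`, `oe_ratios`, `pearson`, and `correlate_series` are hypothetical.

```python
from collections import Counter
from itertools import product
from math import sqrt


def count_kmers(seqs, k):
    """Count overlapping k-mers over a list of DNA strings."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts


def oe_ratios(seqs, k, order):
    """Return {word: O/E} for every k-mer over ACGT whose E is positive.

    order == 0: E from base composition of the series.
    order >= 1: E from a Markov chain of that order, estimated from
    (order+1)-mer and order-mer counts (an assumed estimator).
    """
    if not 0 <= order < k:
        raise ValueError("order must satisfy 0 <= order < k")
    observed = count_kmers(seqs, k)
    n_words = sum(observed.values())
    if n_words == 0:
        return {}
    high = count_kmers(seqs, order + 1)
    low = count_kmers(seqs, order) if order > 0 else None
    n_bases = sum(high.values()) if order == 0 else None

    ratios = {}
    for letters in product("ACGT", repeat=k):
        word = "".join(letters)
        if order == 0:
            # E = (number of k-mer positions) * product of base frequencies
            expected = float(n_words)
            for base in word:
                expected *= high[base] / n_bases
        else:
            # E = product of overlapping (order+1)-mer counts divided by
            # the counts of the interior order-mers they share
            expected = 1.0
            for i in range(k - order):
                expected *= high[word[i:i + order + 1]]
            for i in range(1, k - order):
                if expected == 0:
                    break
                expected /= low[word[i:i + order]]
        if expected > 0:
            ratios[word] = observed[word] / expected
    return ratios


def pearson(xs, ys):
    """Pearson correlation; both lists must have non-zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def correlate_series(ratios_a, ratios_b):
    """Correlate O/E ratios over the words present in both series."""
    words = sorted(set(ratios_a) & set(ratios_b))
    return pearson([ratios_a[w] for w in words],
                   [ratios_b[w] for w in words])
```

Under these assumptions, a call such as `correlate_series(oe_ratios(host_seqs, 3, 1), oe_ratios(virus_seqs, 3, 1))` would give the triplet-level correlation for one pair of series, where `host_seqs` and `virus_seqs` are hypothetical lists of sequence strings. Words whose expected count is zero are left out here rather than assigned a ratio, which is a design choice of this sketch.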