Sobolevsky Yehoshua, Trifonov Edward N
Institute of Mathematics, Hebrew University of Jerusalem, Jerusalem 91904, Israel.
J Mol Evol. 2005 Nov;61(5):591-6. doi: 10.1007/s00239-004-0256-8. Epub 2005 Oct 4.
The full repertoire of octapeptides present in at least 30 of the 131 currently available bacterial proteomes is computationally derived and filtered. The search uses an original technique that is comparable to the suffix tree method in computational time and memory. The presence of a given sequence in a large number of proteomes qualifies it as a conserved sequence: the larger the number of proteomes in which it is found, the higher its conservation. The concept of the compositional age of amino acid sequences (the "compositional clock") is introduced for the first time. The compositional age is calculated on the basis of the consensus temporal order of appearance of amino acids in early evolution. A correlation between compositional age and sequence conservation is established.
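
The two quantities described above can be illustrated with a minimal Python sketch. It is not the authors' search technique: it uses a plain hash map instead of their suffix-tree-like method, so it shows only what is counted, not how to count it efficiently. The function names `conserved_kmers` and `compositional_age`, the input layout (a dict mapping proteome name to a list of protein strings), and the scoring rule (mean rank of the residues in a supplied temporal order) are all assumptions made for illustration; the abstract does not state the exact formula for compositional age, and the amino acid order must be supplied by the caller.

```python
from collections import defaultdict


def conserved_kmers(proteomes, k=8, min_proteomes=30):
    """Count, for each k-mer, the number of proteomes that contain it.

    `proteomes` is assumed to map a proteome name to a list of protein
    sequences (strings). Only k-mers found in at least `min_proteomes`
    distinct proteomes are returned.
    """
    presence = defaultdict(set)  # k-mer -> names of proteomes containing it
    for name, proteins in proteomes.items():
        for seq in proteins:
            for i in range(len(seq) - k + 1):
                presence[seq[i:i + k]].add(name)
    return {
        kmer: len(names)
        for kmer, names in presence.items()
        if len(names) >= min_proteomes
    }


def compositional_age(peptide, order):
    """Mean rank of the peptide's residues in a temporal order of amino acids.

    `order` lists amino acids from earliest to latest. This scoring rule is
    an assumed stand-in, not the formula from the paper. Under it, a lower
    score means a composition dominated by earlier amino acids.
    """
    rank = {aa: i for i, aa in enumerate(order)}
    return sum(rank[aa] for aa in peptide) / len(peptide)
```

With real data, `conserved_kmers` would be called with `k=8` and `min_proteomes=30` to mirror the thresholds in the abstract, and the resulting proteome counts could then be compared against `compositional_age` scores to look for the reported correlation.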