Gelfand M S, Podolsky L I, Astakhova T V, Roytberg M A
Institute of Protein Research, Russian Academy of Sciences, Moscow Region, Russia.
J Comput Biol. 1996 Summer;3(2):223-34. doi: 10.1089/cmb.1996.3.223.
A new approach to computer-assisted gene recognition in the DNA of higher eukaryotes is proposed. It allows structures to be scored not only with linear functions but with any function satisfying natural monotonicity conditions. The algorithm constructs a set of structures that is guaranteed to contain an optimal structure for every such function. It thus uncouples the time-consuming generation of this set from the fast scoring of structures, which makes it simple to experiment with different functions. One particular scoring function, which takes into account only codon usage and positional nucleotide frequencies at the splice sites, has been implemented in the Genome Recognition and Exon Assembly Tool program and tested on an independent sample of human genes, yielding 88% sensitivity and 79% specificity.
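
The idea of one candidate set serving every monotone scoring function can be illustrated with a small sketch. It rests on assumptions that are not in the abstract: each candidate structure is reduced to a vector of component scores (here a hypothetical pair of codon-usage and splice-site scores), "monotone" is read as non-decreasing in every component, and the candidates are a tiny made-up list filtered by brute force. It is not the paper's algorithm, only a picture of why a set of non-dominated structures suffices.

```python
from typing import Callable, List, Sequence, Tuple

Score = Tuple[float, ...]  # assumed representation: one number per score component


def dominates(a: Score, b: Score) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated(scores: Sequence[Score]) -> List[Score]:
    """Slow step, done once: keep the vectors that no other vector dominates."""
    return [s for s in scores if not any(dominates(t, s) for t in scores)]


def best(scores: Sequence[Score], f: Callable[[Score], float]) -> Score:
    """Fast step, repeated per scoring function: pick the highest-scoring vector."""
    return max(scores, key=f)


# Hypothetical candidates: (codon-usage score, splice-site score). Illustrative only.
candidates = [(0.9, 0.2), (0.6, 0.6), (0.3, 0.8), (0.5, 0.5), (0.2, 0.1)]
reduced = non_dominated(candidates)

# Three scoring functions, all non-decreasing in each component.
scoring_functions = {
    "linear": lambda s: 0.5 * s[0] + 0.5 * s[1],
    "min": lambda s: min(s[0], s[1]),
    "quadratic": lambda s: s[0] ** 2 + s[1],
}

for name, f in scoring_functions.items():
    # The optimum over the reduced set matches the optimum over all candidates.
    print(name, best(reduced, f), f(best(reduced, f)) == f(best(candidates, f)))
```

In this toy data the reduced set keeps three of the five candidates, and switching scoring functions only repeats the cheap `best` call, which mirrors the uncoupling described in the abstract.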