Dumas J P, Ninio J
Nucleic Acids Res. 1982 Jan 11;10(1):197-206. doi: 10.1093/nar/10.1.197.
Fast algorithms for analysing sequence data are presented. An algorithm for strict homologies finds all common subsequences of length greater than or equal to 6 in two given sequences. With it, nucleic acid pieces five thousand nucleotides long can be compared in five seconds on a CDC 6600. Secondary structure algorithms generate the N most stable secondary structures of an RNA molecule, taking into account all loop contributions and the formation of all possible base pairs in stems, including odd pairs (G.G, C.U, etc.). They allow a typical 100-nucleotide sequence to be analysed in 10 seconds. The homology and secondary structure programs are illustrated, respectively, with a comparison of two phage genomes and a discussion of Drosophila melanogaster 5S RNA folding.
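The strict-homology search described above can be illustrated with a short sketch. The abstract does not say how the authors implemented it, so the following is only one plausible approach, assuming a word-indexing strategy: every 6-letter word of the first sequence is indexed by position, the second sequence is scanned for matching words, and each hit is extended to a maximal exact match. The function name `common_substrings` and its parameters are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def common_substrings(a, b, min_len=6):
    """Return maximal exact matches as (start_in_a, start_in_b, length).

    Illustrative sketch only: this is an assumed word-indexing approach,
    not the method of the original paper.
    """
    # Index every min_len-long word of `a` by its start positions.
    index = defaultdict(list)
    for i in range(len(a) - min_len + 1):
        index[a[i:i + min_len]].append(i)

    matches = []
    for j in range(len(b) - min_len + 1):
        for i in index.get(b[j:j + min_len], ()):
            # Skip seeds that only continue a match starting one position earlier.
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            # Extend to the right for as long as the two sequences agree.
            length = min_len
            while (i + length < len(a) and j + length < len(b)
                   and a[i + length] == b[j + length]):
                length += 1
            matches.append((i, j, length))
    return sorted(matches)


print(common_substrings("TTGACGTACGTAAC", "CCGACGTACGTGG"))  # [(2, 2, 9)]
```

In this toy input the two sequences share the 9-nucleotide stretch GACGTACGT starting at position 2 in each. Indexing makes the scan roughly linear in the sequence lengths plus the number of word hits, which is the kind of behaviour needed to compare sequences thousands of nucleotides long quickly.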