Karlin S, Ghandour G
Proc Natl Acad Sci U S A. 1985 Sep;82(18):6186-90. doi: 10.1073/pnas.82.18.6186.
Concepts and methods [Karlin, S. & Ghandour, G. (1985) Proc. Natl. Acad. Sci. USA 82, 5800-5804] for the analysis of patterns and relationships are extended to multiple DNA and protein sequences. Functionals include multiple sequence common word occurrence distributions, characterizations of high frequency shared words, and ascertainment of long block identities. Various comparisons of sequences using natural alphabets obtained from grouping nucleotides or amino acids by their chemical and functional characteristics are described. Specific applications are given to globin genes, mitochondrial genomes, and a variety of mammalian viruses.
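The abstract gives no algorithmic detail, so the following is only a minimal sketch of two of the ideas it names: recoding sequences into a reduced alphabet, and tabulating words that occur in every sequence of a set. The function names, the fixed word length `k`, and the purine/pyrimidine grouping are assumptions chosen for illustration and are not taken from the paper.

```python
from collections import Counter

# Illustrative grouping (an assumption, not from the paper):
# purines A, G -> R; pyrimidines C, T -> Y.
PURINE_PYRIMIDINE = {"A": "R", "G": "R", "C": "Y", "T": "Y"}


def reduce_alphabet(seq, mapping):
    """Recode a sequence letter by letter into a smaller alphabet."""
    return "".join(mapping[ch] for ch in seq.upper())


def word_counts(seq, k):
    """Count every overlapping word of length k in one sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def shared_words(seqs, k):
    """Return the k-words present in all sequences.

    The result maps each shared word to its list of occurrence
    counts, one count per input sequence, in input order.
    """
    if not seqs:
        return {}
    counts = [word_counts(s, k) for s in seqs]
    common = set(counts[0])
    for c in counts[1:]:
        common &= set(c)
    return {w: [c[w] for c in counts] for w in sorted(common)}


if __name__ == "__main__":
    dna = ["ACGTAC", "TACGGA", "GGTACG"]
    print(shared_words(dna, 3))
    # Same comparison after recoding into the two-letter alphabet.
    recoded = [reduce_alphabet(s, PURINE_PYRIMIDINE) for s in dna]
    print(shared_words(recoded, 3))
```

Recoding before counting means that words differing only within a group (here, a purine swapped for another purine) are treated as the same word, so the reduced alphabet typically reveals more shared words than the original one.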