Average values of a dissimilarity measure not requiring sequence alignment are twice the averages of conventional mismatch counts requiring sequence alignment for a variety of computer-generated model systems.

Author information

Blaisdell B E

Affiliation

Department of Mathematics, Stanford University, CA 94305.

Publication information

J Mol Evol. 1991 Jun;32(6):521-8. doi: 10.1007/BF02102654.

DOI:10.1007/BF02102654

PMID:1908023

Abstract

A measure of sequence dissimilarity, dt, not requiring prior sequence alignment gave correct results for a variety of computer-generated model sequences, with and without gaps, for all degrees of substitution, s. Measure d was the squared Euclidean distance between the vectors of counts of t-tuplets of characters in the two sequences. In models without gaps and without Needleman-Wunsch alignment, average d was very closely equal to twice the average conventional mismatch count, m. In these models, each of the following conditions of the Jukes-Cantor model was violated in turn: (1) both descendant lineages receive the same number of substitutions, (2) all sites are equally likely to be substituted, (3) all different replacement characters are equally likely to be chosen, and (4) all original characters are equally likely to be substituted. In Jukes-Cantor models with gaps, Needleman-Wunsch alignment was necessarily performed, a procedure that generally produced incorrect values of m. For these models, average d was found to be very closely equal to twice the average m estimated from the known value of s using the inverted Jukes-Cantor formula.
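The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation. The function names (`tuple_counts`, `d_t`, `mismatch_count`, `expected_mismatches`) are hypothetical, and two points are assumptions that the abstract does not settle: that t-tuplets are counted in overlapping windows, and that s denotes the expected number of substitutions per site separating the two sequences over a four-letter alphabet. Under the second assumption, the inverted Jukes-Cantor formula gives an expected mismatch fraction of (3/4)(1 - exp(-4s/3)).

```python
import math
from collections import Counter


def tuple_counts(seq, t):
    """Count the t-tuplets of seq (assumption: overlapping windows)."""
    return Counter(seq[i:i + t] for i in range(len(seq) - t + 1))


def d_t(seq_a, seq_b, t):
    """Squared Euclidean distance between the two t-tuplet count vectors.

    No alignment is needed, so the sequences may differ in length.
    """
    counts_a = tuple_counts(seq_a, t)
    counts_b = tuple_counts(seq_b, t)
    # A Counter returns 0 for a tuplet it has not seen.
    words = set(counts_a) | set(counts_b)
    return sum((counts_a[w] - counts_b[w]) ** 2 for w in words)


def mismatch_count(seq_a, seq_b):
    """Conventional mismatch count m for gap-free sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length (no gaps)")
    return sum(x != y for x, y in zip(seq_a, seq_b))


def expected_mismatches(s, length):
    """Expected m from s via the inverted Jukes-Cantor formula.

    Assumption: s is the expected number of substitutions per site
    between the two sequences, over a four-letter alphabet.
    """
    return length * 0.75 * (1.0 - math.exp(-4.0 * s / 3.0))


# One substitution changes two single-character counts by one each,
# so d is 2 while m is 1.
print(d_t("ACGT", "ACGA", 1), mismatch_count("ACGT", "ACGA"))  # 2 1
```

In this toy pair, d equals exactly 2m. The abstract's claim concerns averages over many simulated sequences, which a single pair does not demonstrate.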
