The significance of protein sequence similarities.

Authors

Collins J F, Coulson A F, Lyall A

Affiliation

Department of Molecular Biology, University of Edinburgh, UK.

Publication

Comput Appl Biosci. 1988 Mar;4(1):67-71. doi: 10.1093/bioinformatics/4.1.67.

Abstract

A general method of assessing the significance of scored best local alignments, particularly suited to protein sequence comparisons, is described. The method establishes the parameters describing the distribution of the best results from any search program, provided that the set is sufficiently large and the majority of the alignments arise from unrelated sequences. The expected frequency of occurrence of any score can then be calculated, together with the number of standard deviations above expectation. These provide sensible measures of significance without additional search operations. However, the biological significance of any alignment or set of alignments depends not solely on the improbability of the alignment, but on all relevant factors known to the biologist.
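
The idea of estimating a background distribution from the search results themselves, and then reading off an expected frequency and a number of standard deviations for any score, can be illustrated with a minimal sketch. This is not the authors' procedure: the abstract does not name the distribution or the fitting method. The sketch assumes that the upper half of the score list decays roughly exponentially, and it uses the median and the median absolute deviation (MAD) as robust stand-ins for the mean and standard deviation, so that a minority of genuinely related sequences has little influence on the fit. All function names are hypothetical.

```python
import math
import statistics


def robust_center_and_scale(scores):
    """Median and MAD-based scale (ASSUMPTION: robust stand-ins for the
    mean and standard deviation of the unrelated-sequence background)."""
    center = statistics.median(scores)
    mad = statistics.median([abs(s - center) for s in scores])
    # 1.4826 rescales the MAD to match a standard deviation for normal data.
    return center, 1.4826 * mad


def deviations_above_expectation(score, center, scale):
    """Number of (robust) standard deviations a score lies above the center."""
    return (score - center) / scale


def fit_exponential_tail(scores):
    """Fit an exponential decay to scores above the median.

    ASSUMPTION: excesses over the median are exponentially distributed.
    The rate comes from the median excess (median of Exp(rate) = ln 2 / rate),
    which a few very high scores from related sequences barely affect.
    """
    threshold = statistics.median(scores)
    excesses = [s - threshold for s in scores if s > threshold]
    rate = math.log(2) / statistics.median(excesses)
    return threshold, rate


def expected_count_at_or_above(score, n_scores, threshold, rate):
    """Expected number of background scores >= score among n_scores results.

    Half of the scores lie above the median by definition; the model only
    describes the upper tail, so scores below the threshold are clamped.
    """
    excess = max(score - threshold, 0.0)
    return 0.5 * n_scores * math.exp(-rate * excess)


if __name__ == "__main__":
    # Synthetic background scores (illustrative only) plus one high score.
    background = [20 + 5 * -math.log(1 - (i + 0.5) / 1000) for i in range(1000)]
    scores = background + [120.0]

    center, scale = robust_center_and_scale(scores)
    threshold, rate = fit_exponential_tail(scores)
    for s in (30.0, 120.0):
        z = deviations_above_expectation(s, center, scale)
        e = expected_count_at_or_above(s, len(scores), threshold, rate)
        print(f"score={s:.0f}  deviations={z:.1f}  expected count={e:.3g}")
```

A score whose expected count is far below 1 would not be expected to arise from unrelated sequences in a search of that size. As the abstract stresses, such a number measures improbability only, not biological significance.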
