Combining evidence using p-values: application to sequence homology searches.

Authors

Bailey T L, Gribskov M

Affiliation

San Diego Supercomputer Center, CA 92186-9784, USA.

Publication

Bioinformatics. 1998;14(1):48-54. doi: 10.1093/bioinformatics/14.1.48.

DOI:10.1093/bioinformatics/14.1.48

PMID:9520501

Abstract

MOTIVATION

To illustrate an intuitive and statistically valid method for combining independent sources of evidence that yields a p-value for the complete evidence, and to apply it to the problem of detecting simultaneous matches to multiple patterns in sequence homology searches.

RESULTS

In sequence analysis, two or more (approximately) independent measures of the membership of a sequence (or sequence region) in some class are often available. We would like to estimate the likelihood of the sequence being a member of the class in view of all the available evidence. An example is estimating the significance of the observed match of a macromolecular sequence (DNA or protein) to a set of patterns (motifs) that characterize a biological sequence family. An intuitive way to do this is to express each piece of evidence as a p-value, and then use the product of these p-values as the measure of membership in the family. We derive a formula and algorithm (QFAST) for calculating the statistical distribution of the product of n independent p-values. We demonstrate that sorting sequences by this p-value effectively combines the information present in multiple motifs, leading to highly accurate and sensitive sequence homology searches.
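The abstract does not reproduce the formula itself, so the following is only a minimal sketch of the idea, not the authors' QFAST implementation. It assumes each p-value is uniformly distributed on [0, 1] under the null hypothesis and that the p-values are independent; under those assumptions the product x of n p-values satisfies the standard closed form P(product <= x) = x * sum over i from 0 to n-1 of (-ln x)^i / i!. The function name `combined_p_value` is hypothetical.

```python
import math


def combined_p_value(p_values):
    """P-value of the product of n independent p-values.

    Assumes each input is uniform on [0, 1] under the null hypothesis.
    Uses the closed form  x * sum_{i=0}^{n-1} (-ln x)^i / i!  where x is
    the product of the inputs. Illustrative sketch only.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 <= p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in [0, 1]")
    if any(p == 0.0 for p in p_values):
        return 0.0  # the product is 0, so its p-value is 0

    # Sum logs instead of multiplying, to limit underflow for many small inputs.
    neg_log_product = -sum(math.log(p) for p in p_values)

    # Accumulate sum_{i=0}^{n-1} (-ln x)^i / i! term by term.
    term = 1.0
    total = 1.0
    for i in range(1, len(p_values)):
        term *= neg_log_product / i
        total += term

    return math.exp(-neg_log_product) * total


if __name__ == "__main__":
    # Two motif matches, each with p = 0.05: the raw product 0.0025 is not
    # itself a p-value; the combined p-value is larger.
    print(combined_p_value([0.05, 0.05]))
```

With a single p-value the function returns that value unchanged, and the raw product of several p-values always understates the combined p-value, which is why the product needs this correction before sequences are ranked by it.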
