Zipf's law in importance of genes for cancer classification using microarray data.

Author information

Li Wentian, Yang Yaning

Affiliation

Center for Genomics and Human Genetics North Shore LIJ Research Institute, 350 Community Drive, Manhasset, NY 11030, USA.

Publication information

J Theor Biol. 2002 Dec 21;219(4):539-51. doi: 10.1006/jtbi.2002.3145.

DOI:10.1006/jtbi.2002.3145

PMID:12425984

Abstract

Using a measure of how differentially expressed a gene is in two biochemically/phenotypically different conditions, we can rank all genes in a microarray dataset. We have shown that the falling-off of this measure (normalized maximum likelihood in a classification model such as logistic regression) as a function of the rank is typically a power-law function. In other similar ranked plots, this power-law function is known as Zipf's law, which is observed in many natural and social phenomena. The presence of this power-law function rules out an intrinsic cutoff point between the "important" genes and the "irrelevant" genes. We have shown that similar power-law functions are also present in permuted datasets, and we provide an explanation based on the well-known chi-square (χ²) distribution of likelihood ratios. We discuss the implications of this Zipf's law for gene selection in microarray data analysis, as well as other characterizations of the ranked likelihood plots, such as the rate of fall-off of the likelihood.
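
The rank-versus-measure idea in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: instead of fitting logistic regression to real expression data, it draws chi-square statistics with one degree of freedom (an assumption standing in for per-gene likelihood-ratio statistics in a permuted dataset), ranks them, and estimates the slope of log(value) against log(rank). The function names `loglog_slope` and `simulate_null_statistics`, the gene count, and the choice to fit only the top 500 ranks are all hypothetical.

```python
import math
import random


def loglog_slope(values):
    """Least-squares slope of log(value) versus log(rank); rank 1 is the largest value."""
    ranked = sorted((v for v in values if v > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(v) for v in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate_null_statistics(n_genes, seed=0):
    """Assumed null model: one chi-square(1) draw per gene (a squared standard normal)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) ** 2 for _ in range(n_genes)]


# Rank 5000 simulated "genes" and fit the top 500 ranks on a log-log scale.
null_stats = simulate_null_statistics(5000)
top_ranked = sorted(null_stats, reverse=True)[:500]
print("fitted log-log slope:", loglog_slope(top_ranked))
```

A straight line on the log-log plot corresponds to a power law, and the fitted slope is one way to summarize the rate of fall-off. On real data, the simulated draws would be replaced by the per-gene measure computed from the classification model.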
