Department of Statistics, Stanford University, Stanford, California, United States of America; Department of Internal Medicine, Roy J. and Lucille A. Carver College of Medicine, University of Iowa, Iowa City, Iowa, United States of America.
PLoS One. 2006 Dec 20;1(1):e88. doi: 10.1371/journal.pone.0000088.
BACKGROUND: There is considerable interest in developing microarray platforms that measure mRNA abundance at both the gene level and the exon level. The Affymetrix Exon Array is a new high-density gene expression microarray platform, with over six million probes targeting all annotated and predicted exons in a genome. An important question in the analysis of exon array data is how to compute overall gene expression indexes. Because of the complexity of exon array probe design, this problem differs in nature from summarizing gene-level expression from traditional 3' expression arrays.
METHODOLOGY/PRINCIPAL FINDINGS: In this manuscript, we use exon array data from 11 human tissues to study methods for computing gene-level expression. We show that for most genes there is a subset of exon array probes whose intensities are highly correlated across multiple samples. We suggest that these probes can serve as reliable indicators of overall gene expression levels. We developed a probe selection algorithm that selects such a subset of highly correlated probes for each gene, and we computed gene expression indexes using the selected probes.
CONCLUSIONS/SIGNIFICANCE: Our results demonstrate that probe selection improves gene expression estimates from exon arrays. The selected probes can be used in future analyses of other exon array datasets to compute gene expression indexes.
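The probe selection idea summarized above can be illustrated with a small sketch. The abstract does not specify the selection procedure or the summarization model, so everything below is an assumption made for illustration: a greedy filter that repeatedly drops the probe with the lowest mean Pearson correlation to the remaining probes, followed by a per-sample median as the gene-level index. The function names, the `min_corr` and `min_probes` parameters, and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np


def select_correlated_probes(intensities, min_corr=0.7, min_probes=3):
    """Greedy sketch (an assumption, not the published algorithm).

    intensities: array of shape (n_probes, n_samples) holding log-scale
    probe intensities for a single gene. Returns the row indices of the
    probes that are kept.
    """
    x = np.asarray(intensities, dtype=float)
    corr = np.corrcoef(x)                 # probe-by-probe Pearson correlation
    corr = np.nan_to_num(corr, nan=0.0)   # constant probes count as uncorrelated
    np.fill_diagonal(corr, 1.0)
    keep = np.arange(x.shape[0])
    while len(keep) > min_probes:
        sub = corr[np.ix_(keep, keep)]
        # Mean correlation of each kept probe with the other kept probes
        # (the self-correlation of 1 on the diagonal is excluded).
        mean_corr = (sub.sum(axis=1) - 1.0) / (len(keep) - 1)
        worst = int(np.argmin(mean_corr))
        if mean_corr[worst] >= min_corr:
            break                         # every remaining probe agrees well enough
        keep = np.delete(keep, worst)
    return keep


def gene_expression_index(intensities, selected):
    """Per-sample median over the selected probes (a simple stand-in index)."""
    x = np.asarray(intensities, dtype=float)
    return np.median(x[selected], axis=0)


# Synthetic example: 11 samples, four probes that track a shared signal
# and two probes that do not (all values are made up for illustration).
i = np.arange(11)
signal = np.linspace(4.0, 10.0, 11)
alternating = np.where(i % 2 == 0, 1.0, -1.0)
probes = np.vstack([
    signal,
    signal + 1.5,
    0.8 * signal - 0.5,
    signal + 0.1 * alternating,
    7.0 + alternating,            # unrelated to the shared signal
    6.0 + 0.2 * (i - 5) ** 2,     # unrelated to the shared signal
])

selected = select_correlated_probes(probes)
index = gene_expression_index(probes, selected)
print("selected probes:", selected)
print("gene-level index:", np.round(index, 2))
```

On this synthetic input the filter should retain the four co-varying probes and discard the two unrelated ones. A real analysis would also need background correction and normalization, and might replace the median with a model-based summary; none of that is shown here.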