Kerr M K, Churchill G A
The Jackson Laboratory, Bar Harbor, ME 04609, USA.
Proc Natl Acad Sci U S A. 2001 Jul 31;98(16):8961-5. doi: 10.1073/pnas.161273698. Epub 2001 Jul 24.
We introduce a general technique for making statistical inference from clustering tools applied to gene expression microarray data. The approach utilizes an analysis of variance model to achieve normalization and estimate differential expression of genes across multiple conditions. Statistical inference is based on the application of a randomization technique, bootstrapping. Bootstrapping has previously been used to obtain confidence intervals for estimates of differential expression for individual genes. Here we apply bootstrapping to assess the stability of results from a cluster analysis. We illustrate the technique with a publicly available data set and draw conclusions about the reliability of clustering results in light of variation in the data. The bootstrapping procedure relies on experimental replication. We discuss the implications of replication and good design in microarray experiments.
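The procedure outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a balanced design stored as a log-expression array of shape (genes, conditions, replicates), a simplified ANOVA model with only gene, condition, and gene-by-condition terms (the paper's model may include further terms), and a clustering step that assigns each gene to the most correlated of a fixed set of reference profiles. All function names and parameters are hypothetical.

```python
import numpy as np

def interaction_effects(y):
    """Gene-by-condition effects from a balanced ANOVA fit.

    y has shape (genes, conditions, replicates); simplified model (assumption):
    y = mu + gene + condition + gene:condition + error.
    """
    cell = y.mean(axis=2)                      # fitted value per gene/condition cell
    gene = cell.mean(axis=1, keepdims=True)    # per-gene mean
    cond = cell.mean(axis=0, keepdims=True)    # per-condition mean (normalization term)
    return cell - gene - cond + cell.mean()

def assign_to_profiles(effects, profiles):
    """Index of the reference profile most correlated with each gene's effects."""
    e = effects - effects.mean(axis=1, keepdims=True)
    p = profiles - profiles.mean(axis=1, keepdims=True)
    e_norm = np.linalg.norm(e, axis=1, keepdims=True)
    p_norm = np.linalg.norm(p, axis=1, keepdims=True)  # profiles must not be constant
    corr = (e / np.where(e_norm == 0, 1, e_norm)) @ (p / p_norm).T
    return corr.argmax(axis=1)

def bootstrap_stability(y, profiles, n_boot=200, seed=0):
    """Residual bootstrap: fraction of resamples in which each gene keeps its cluster."""
    rng = np.random.default_rng(seed)
    cell = y.mean(axis=2, keepdims=True)       # fitted values, shape (G, C, 1)
    resid = (y - cell).ravel()                 # pooled residuals; requires replicates
    original = assign_to_profiles(interaction_effects(y), profiles)
    agree = np.zeros(y.shape[0])
    for _ in range(n_boot):
        # Rebuild a data set from fitted values plus resampled residuals.
        y_star = cell + rng.choice(resid, size=y.shape, replace=True)
        agree += assign_to_profiles(interaction_effects(y_star), profiles) == original
    return original, agree / n_boot
```

A gene whose stability is close to 1 keeps its cluster assignment under resampling, whereas a low value suggests the assignment is sensitive to noise. Without replicates every residual in this sketch would be zero, which is one way to see why the bootstrap relies on replication.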