Dobbin Kevin, Simon Richard
Biometric Research Branch, National Cancer Institute, 6130 Executive Blvd., Bethesda, MD, 20892-7434, USA.
Biostatistics. 2005 Jan;6(1):27-38. doi: 10.1093/biostatistics/kxh015.
Determining sample sizes for microarray experiments is important, but the complexity of these experiments and the large amounts of data they produce can make the sample size issue seem daunting and tempt researchers to use rules of thumb in place of formal calculations based on the goals of the experiment. Here we present formulae for determining sample sizes to achieve a variety of experimental goals, including class comparison and the development of prognostic markers. We derive results that describe the impact of pooling, technical replicates and dye-swap arrays on sample size requirements, and show that these results depend on the relative sizes of the different sources of variability. We consider a variety of common experimental situations and designs used with single-label and dual-label microarrays, and we discuss procedures for controlling the false discovery rate. Our calculations are based on relatively simple yet realistic statistical models for the data, and they yield straightforward sample size formulae.
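To make the idea of a formal sample size calculation for class comparison concrete, here is a minimal sketch. It uses the generic normal-approximation formula for a two-class comparison with equal group sizes, n = 4 (z_{alpha/2} + z_beta)^2 (sigma / delta)^2, which is a textbook result and is not necessarily the exact formula derived in the paper. The function name and the parameters `delta` (the mean log-expression difference to detect for a gene), `sigma` (the per-gene standard deviation within a class), `alpha` and `power` are assumptions introduced only for illustration.

```python
import math
from statistics import NormalDist


def total_samples_two_class(delta, sigma, alpha=0.001, power=0.95):
    """Approximate total sample count for comparing two equal-sized classes.

    Assumed (hypothetical) inputs:
      delta : mean difference in log expression to detect for a gene
      sigma : within-class standard deviation of log expression
      alpha : two-sided per-gene significance level
      power : desired probability of detecting a true difference of size delta
    """
    if delta <= 0 or sigma <= 0:
        raise ValueError("delta and sigma must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = standard_normal.inv_cdf(power)           # quantile for the target power
    n_total = 4 * (z_alpha + z_beta) ** 2 * (sigma / delta) ** 2
    return math.ceil(n_total)  # round up to a whole number of samples


if __name__ == "__main__":
    # Illustrative inputs only; these values are not taken from the paper.
    print(total_samples_two_class(delta=1.0, sigma=0.5))
```

A stringent per-gene `alpha` is used as the default because many genes are tested at once; the normal approximation tends to understate the requirement when the resulting n is small, so the output is best read as a starting point rather than a final design.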