Wei Caimiao, Li Jiangning, Bumgarner Roger E
Department of Microbiology, University of Washington, Seattle, WA 98195, USA.
BMC Genomics. 2004 Nov 8;5:87. doi: 10.1186/1471-2164-5-87.
Microarray experiments are often performed with a small number of biological replicates, resulting in low statistical power for detecting differentially expressed genes and concomitantly high false positive rates. While increasing sample size can increase statistical power and decrease error rates, using too many samples wastes valuable resources. The issue of how many replicates are required in a typical experimental system needs to be addressed. Of particular interest is the difference in required sample sizes for similar experiments in inbred vs. outbred populations (e.g. mouse and rat vs. human).
We hypothesize that if all other factors (assay protocol, microarray platform, data pre-processing) were equal, fewer individuals would be needed for the same statistical power using inbred animals as opposed to unrelated human subjects, as genetic effects on gene expression will be removed in the inbred populations. We apply the same normalization algorithm and estimate the variance of gene expression for a variety of cDNA data sets (humans, inbred mice and rats) comparing two conditions. Using one-sample, paired-sample, or two-independent-sample t-tests, we calculate the sample sizes required to detect 1.5-, 2-, and 4-fold changes in expression level as a function of false positive rate, power, and percentage of genes that have a standard deviation below a given percentile.
Factors that affect power and sample size calculations include variability of the population, the desired detectable differences, the power to detect the differences, and an acceptable error rate. In addition, experimental design, technical variability, and data pre-processing play a role in the power of the statistical tests in microarrays. We show that the number of samples required for detecting a 2-fold change with 90% probability and a p-value of 0.01 in humans is much larger than the number of samples commonly used in present-day studies, and that far fewer individuals are needed for the same statistical power when using inbred animals rather than unrelated human subjects.
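To illustrate how the factors listed above interact, the following is a minimal sketch of a sample size calculation on the log2 expression scale. It is not the authors' procedure: it swaps the t-distribution for a normal approximation, which tends to underestimate the required n when n is small. The function name `samples_needed`, its parameters, and the standard deviation values in the usage loop are all hypothetical and are not taken from the paper's data.

```python
import math
from statistics import NormalDist


def samples_needed(fold_change, sd_log2, alpha=0.01, power=0.90, two_sample=True):
    """Approximate sample size to detect a given fold change.

    Normal approximation to the t-test (an assumption of this sketch):
      one-sample / paired:  n = ((z_{1-alpha/2} + z_{power}) * sd / delta)^2
      two independent:      n per group = 2 * the value above
    where delta = log2(fold_change) and sd is the standard deviation of the
    log2 values (for a paired design, the sd of the log2 ratios).
    """
    if fold_change <= 1:
        raise ValueError("fold_change must be greater than 1")
    if sd_log2 <= 0:
        raise ValueError("sd_log2 must be positive")

    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided false positive rate
    z_power = std_normal.inv_cdf(power)          # desired power
    delta = math.log2(fold_change)               # detectable difference on log2 scale

    n = ((z_alpha + z_power) * sd_log2 / delta) ** 2
    if two_sample:
        n *= 2  # each group needs twice as many as the one-sample case
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical standard deviations, chosen only to show the trend:
    # a less variable (e.g. inbred) vs. a more variable (e.g. outbred) population.
    for label, sd in [("low variability", 0.3), ("high variability", 0.8)]:
        for fc in (1.5, 2, 4):
            n = samples_needed(fc, sd)
            print(f"{label}: {fc}-fold change -> about {n} per group")
```

Because the required n scales with the square of the standard deviation and inversely with the square of the log2 fold change, a modest reduction in population variability can reduce the required sample size substantially, which is the effect the abstract describes for inbred animals.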