Li Chung-I, Su Pei-Fang, Guo Yan, Shyr Yu
Department of Applied Mathematics, National Chiayi University, No. 300, Xuefu Rd., East Dist., Chiayi City, Taiwan.
Int J Comput Biol Drug Des. 2013;6(4):358-75. doi: 10.1504/IJCBDD.2013.056830. Epub 2013 Sep 30.
Sample size determination is an important issue in the experimental design of biomedical research. Because of the complexity of RNA-seq experiments, however, the field currently lacks a sample size method widely applicable to differential expression studies utilising RNA-seq technology. In this report, we propose several methods for sample size calculation for single-gene differential expression analysis of RNA-seq data under a Poisson distribution. These methods are then extended to multiple genes, addressing the multiple testing problem by controlling the false discovery rate. Moreover, most of the proposed methods yield closed-form sample size formulas that require only the desired minimum fold change and minimum average read count, and thus are not computationally intensive. Simulation studies evaluating the performance of the proposed sample size formulas are presented; the results indicate that our methods work well and achieve the desired power. Finally, our sample size calculation methods are applied to three real RNA-seq data sets.
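To give a feel for what a closed-form sample size formula driven by a minimum fold change and a minimum average read count can look like, here is a minimal sketch. It is not one of the formulas proposed in the paper: it uses the generic normal approximation for a Wald test on the log ratio of two Poisson means, and the function name `poisson_sample_size` and its parameters are hypothetical. The abstract does not say how the false discovery rate translates into a per-gene significance level, so `alpha` is simply left as an input here.

```python
import math
from statistics import NormalDist


def poisson_sample_size(fold_change, mean_count, alpha=0.05, power=0.8):
    """Approximate samples per group for a two-group Poisson comparison.

    Assumption (not taken from the paper): a two-sided Wald test on the
    log ratio of two Poisson means, with equal group sizes.
      fold_change: minimum fold change to detect (positive, not 1)
      mean_count:  average read count per sample in the reference group
      alpha:       per-gene significance level
      power:       desired power
    """
    if fold_change <= 0 or fold_change == 1:
        raise ValueError("fold_change must be positive and different from 1")
    if mean_count <= 0:
        raise ValueError("mean_count must be positive")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)

    # Delta-method variance of the log rate ratio for one sample per group.
    var_term = 1 / mean_count + 1 / (fold_change * mean_count)

    n = (z_alpha + z_beta) ** 2 * var_term / math.log(fold_change) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # A stricter per-gene alpha (as multiple testing would demand)
    # requires more samples per group.
    for a in (0.05, 0.001):
        print(a, poisson_sample_size(fold_change=2.0, mean_count=5.0, alpha=a))
```

In this approximation the required sample size shrinks as either the fold change or the average read count grows, which is consistent with specifying both as minimum values when planning for the worst case.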