Guo Zhenxing, Cui Ying, Shi Xiaowen, Birchler James A, Albizua Igor, Sherman Stephanie L, Qin Zhaohui S, Ji Tieming
Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA.
Division of Biological Sciences, University of Missouri at Columbia, Columbia, MO 65211, USA.
NAR Genom Bioinform. 2020 Sep 18;2(3):lqaa072. doi: 10.1093/nargab/lqaa072. eCollection 2020 Sep.
We are motivated by biological studies intended to understand global gene expression fold change. Biologists have generally adopted a fixed cutoff to determine the significance of fold changes in gene expression studies (e.g. by using an observed fold change equal to two as a fixed threshold). Scientists can also use a t-test or a modified differential expression test to assess the significance of fold changes. However, these methods either fail to take advantage of the high dimensionality of gene expression data or fail to test fold change directly. Our research develops a new empirical Bayesian approach to substantially improve the power and accuracy of fold-change detection. Specifically, we more accurately estimate gene-wise error variation in the log of fold change. We then adopt a t-test with adjusted degrees of freedom for significance assessment. We apply our method to a dosage study in Arabidopsis and a Down syndrome study in humans to illustrate the utility of our approach. We also present a simulation study based on real datasets to demonstrate the accuracy of our method in terms of error variance estimation and power in fold-change detection. Our R package, with a detailed user manual, is publicly available on GitHub at https://github.com/cuiyingbeicheng/Foldseq.
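The general idea of borrowing information across genes to stabilize the variance of the log fold change, and then testing with adjusted degrees of freedom, can be illustrated with a minimal sketch. This is not the Foldseq implementation: it assumes a generic moderated-t style shrinkage in which each gene's pooled variance is pulled toward the average variance across genes, and the prior degrees of freedom are fixed by hand rather than estimated from the data. The function name `moderated_t` and the parameter `prior_df` are hypothetical and chosen only for this illustration.

```python
import math
from statistics import mean, variance


def moderated_t(treated, control, prior_df=4.0):
    """Shrink gene-wise variances of log2 expression toward a shared prior.

    treated, control: one list of log2 expression replicates per gene.
    prior_df: ASSUMED prior degrees of freedom (fixed here, not estimated).
    Returns one tuple per gene:
    (log2 fold change, shrunken variance, t statistic, adjusted df).
    """
    per_gene = []
    for t, c in zip(treated, control):
        n1, n2 = len(t), len(c)
        df = n1 + n2 - 2
        # Pooled within-group variance of the log2 values for this gene.
        pooled = ((n1 - 1) * variance(t) + (n2 - 1) * variance(c)) / df
        per_gene.append((mean(t) - mean(c), pooled, df, 1 / n1 + 1 / n2))

    # ASSUMPTION: the prior variance is the plain average across genes.
    prior_var = mean(s2 for _, s2, _, _ in per_gene)

    results = []
    for lfc, s2, df, scale in per_gene:
        # Weighted average of the prior and gene-wise variance estimates.
        shrunk = (prior_df * prior_var + df * s2) / (prior_df + df)
        t_stat = lfc / math.sqrt(shrunk * scale)
        # The reference t distribution gains the prior degrees of freedom.
        results.append((lfc, shrunk, t_stat, prior_df + df))
    return results


if __name__ == "__main__":
    treated = [[3.0, 3.2, 2.8], [5.0, 6.0, 4.0]]
    control = [[2.0, 2.1, 1.9], [5.0, 5.5, 4.5]]
    for row in moderated_t(treated, control):
        print(row)
```

Genes with few replicates have noisy variance estimates, so shrinking toward a shared value makes the test statistic less sensitive to genes whose variance is underestimated by chance. The added degrees of freedom reflect the extra information borrowed from the other genes.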