Lin Yi-Ting, Lee Wen-Chung
Research Center for Genes, Environment and Human Health and Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Rm. 536, No. 17, Xuzhou Rd., Taipei, 100, Taiwan.
BMC Genet. 2015 Aug 4;16:97. doi: 10.1186/s12863-015-0259-z.
Multiple hypothesis testing is a pervasive problem in genomic data analysis. The conventional Bonferroni method, which controls the family-wise error rate, is conservative and has low power. The current paradigm is to control the false discovery rate instead.
We characterize the variability of the false discovery rate indices (local false discovery rate, q-value and false discovery proportion) using the bootstrap method. A colon cancer gene-expression dataset and a genome-wide association study dataset on visual refractive errors are analyzed as demonstrations. We found high variability in false discovery rate control for typical genomic studies.
We advise researchers to present bootstrapped standard errors alongside the false discovery rate indices.
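As a rough illustration of what reporting a bootstrapped standard error next to a false discovery rate index could look like, the sketch below resamples subjects with replacement within each of two groups, recomputes q-values on every replicate, and takes the standard deviation across replicates. It is not the authors' procedure. The function names (`two_sample_pvalue`, `bh_qvalues`, `bootstrap_q_se`), the normal approximation to the Welch statistic, the use of Benjamini-Hochberg adjusted p-values as the q-value estimate (which takes the null proportion to be 1), and the within-group resampling scheme are all assumptions made for this example.

```python
import math
import random
import statistics


def two_sample_pvalue(a, b):
    """Two-sided p-value for a Welch statistic (normal approximation, an assumption)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    if se == 0.0:
        return 1.0  # degenerate resample: no evidence either way
    z = (statistics.fmean(a) - statistics.fmean(b)) / se
    return math.erfc(abs(z) / math.sqrt(2.0))


def bh_qvalues(pvals):
    """Benjamini-Hochberg adjusted p-values, used here as a simple q-value estimate."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest to enforce monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        q[i] = running_min
    return q


def bootstrap_q_se(group_a, group_b, n_boot=200, seed=0):
    """Return (observed q-values, bootstrap standard errors), one entry per feature.

    group_a and group_b are lists of subjects; each subject is a list of
    feature values (for example, expression levels) of equal length.
    """
    rng = random.Random(seed)
    n_features = len(group_a[0])

    def qvalues(a_subjects, b_subjects):
        pvals = [
            two_sample_pvalue([s[j] for s in a_subjects], [s[j] for s in b_subjects])
            for j in range(n_features)
        ]
        return bh_qvalues(pvals)

    q_obs = qvalues(group_a, group_b)
    replicates = []
    for _ in range(n_boot):
        # Resample whole subjects with replacement, separately in each group.
        a_star = rng.choices(group_a, k=len(group_a))
        b_star = rng.choices(group_b, k=len(group_b))
        replicates.append(qvalues(a_star, b_star))
    se = [statistics.stdev(column) for column in zip(*replicates)]
    return q_obs, se


if __name__ == "__main__":
    # Simulated data (hypothetical): 50 features, the first 5 shifted in group B.
    rng = random.Random(42)
    a = [[rng.gauss(0.0, 1.0) for _ in range(50)] for _ in range(10)]
    b = [[rng.gauss(1.5 if j < 5 else 0.0, 1.0) for j in range(50)] for _ in range(10)]
    q, se = bootstrap_q_se(a, b, n_boot=100, seed=1)
    for j in range(5):
        print(f"feature {j}: q = {q[j]:.3f}, bootstrap SE = {se[j]:.3f}")
```

Resampling whole subjects, rather than individual features, keeps the correlation between features within each replicate. A large standard error relative to the q-value suggests that the reported false discovery rate should be read with caution.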