Swartz Michael D, Kim Taebeom, Niu Jiangong, Yu Robert K, Shete Sanjay, Ionita-Laza Iuliana
Division of Biostatistics, University of Texas School of Public Health, Houston, TX 77025, USA ; Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
Division of Biostatistics, University of Texas School of Public Health, Houston, TX 77025, USA.
BMC Proc. 2014 Jun 17;8(Suppl 1 Genetic Analysis Workshop 18):S13. doi: 10.1186/1753-6561-8-S1-S13. eCollection 2014.
We are now well into the sequencing era of genetic analysis, and methods for investigating rare variants associated with disease remain in high demand. Currently, the more common rare variant analysis methods are burden tests and variance component tests. This report introduces a burden test known as the modified replication-based sum statistic and evaluates its performance, along with that of other common burden and variance component tests, in a small sample (103 total cases and controls) using the Genetic Analysis Workshop 18 simulated data with complete knowledge of the simulation model. Specifically, we examine the variable threshold sum statistic, the replication-based sum statistic, the C-alpha test, and the sequence kernel association test. Using minor allele frequency thresholds below 0.05, we find that the modified replication-based sum statistic is competitive with all other methods and that, with only 103 individuals, all methods are vastly underpowered. Much larger sample sizes are needed to confidently identify truly associated genes.