Molecular Microbiology and Biotechnology Department, Tel-Aviv University, Tel Aviv 69978, Israel.
Genetics. 2013 Jul;194(3):769-79. doi: 10.1534/genetics.113.150169. Epub 2013 May 1.
Deep sequencing technologies enable the study of the effects of rare variants on disease risk. While methods have been developed to increase statistical power for detection of such effects, detecting subtle associations requires studies with hundreds or thousands of individuals, which is prohibitively costly. Recently, low-coverage sequencing has been shown to effectively reduce the cost of genome-wide association studies using current sequencing technologies. However, current methods for disease association testing on rare variants cannot be applied directly to low-coverage sequencing data, as they require individual genotype data, which may not be called correctly because of low coverage and inherent sequencing errors. In this article, we propose two novel methods for detecting association of rare variants with disease risk using low-coverage, error-prone sequencing. We show by simulation that our methods outperform previous methods under both low- and high-coverage sequencing and under different disease architectures. We use real data and simulation studies to demonstrate that, to maximize the power to detect associations for a fixed budget, it is desirable to include more samples while lowering coverage and to perform an analysis using our suggested methods.
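The trade-off between sample size and coverage described in the abstract can be illustrated with a toy calculation. This is a minimal sketch and not the methods proposed in the article: it assumes read depth at a site is Poisson distributed with mean equal to the coverage, that each read from a heterozygous carrier shows the variant allele with probability 0.5, that a carrier counts as "observed" if at least one variant read is seen, and that cost scales only with total sequenced coverage (no per-sample fixed cost). The function names and the budget units are hypothetical.

```python
import math


def p_het_missed(coverage: float) -> float:
    """Probability that a heterozygous carrier shows zero variant reads.

    Assumption: depth ~ Poisson(coverage) and each read carries the variant
    allele with probability 0.5, so variant reads ~ Poisson(coverage / 2).
    """
    return math.exp(-coverage / 2.0)


def samples_for_budget(budget: float, coverage: float) -> int:
    """Number of samples affordable when cost is proportional to coverage.

    Assumption: budget is expressed in units of 1x coverage per sample and
    there is no fixed per-sample cost.
    """
    return int(budget // coverage)


def expected_observed_carriers(budget: float, coverage: float,
                               carrier_freq: float) -> float:
    """Expected number of carriers with at least one variant read."""
    n = samples_for_budget(budget, coverage)
    return n * carrier_freq * (1.0 - p_het_missed(coverage))


if __name__ == "__main__":
    budget = 4000.0        # hypothetical budget, in 1x-coverage units
    carrier_freq = 0.01    # hypothetical rare-variant carrier frequency
    for cov in (2.0, 4.0, 10.0, 30.0):
        n = samples_for_budget(budget, cov)
        seen = expected_observed_carriers(budget, cov, carrier_freq)
        print(f"coverage={cov:>4}x  samples={n:>5}  "
              f"P(miss het)={p_het_missed(cov):.3f}  "
              f"expected observed carriers={seen:.2f}")
```

Under these assumptions, lowering coverage raises the chance of missing any single carrier but lets the study include many more carriers overall. The sketch ignores sequencing errors, which produce spurious variant reads at low coverage; that is the reason hard genotype calls become unreliable and why association tests designed for low-coverage data are needed.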