Department of Biostatistics, University of North Carolina, 3101 McGavran-Greenberg Hall, Chapel Hill, NC 27599, USA; Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA; and Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599, USA.
Bioinformatics. 2014 Feb 15;30(4):480-7. doi: 10.1093/bioinformatics/btt719. Epub 2013 Dec 12.
Despite its great capability to detect rare variant associations, next-generation sequencing is still prohibitively expensive when applied to large samples. In case-control studies, it is thus appealing to sequence only a subset of cases to discover variants, and then to genotype the identified variants in the controls and the remaining cases, under the reasonable assumption that causal variants are usually enriched among cases. However, this approach leads to inflated type-I error if the data are analyzed naively for rare variant association. Several methods have been proposed in the recent literature to control type-I error, at the cost of either excluding some sequenced cases or correcting the genotypes of discovered rare variants. All of these approaches therefore lose some information and are underpowered. We propose a novel method (BETASEQ), which corrects the inflation of type-I error by supplementing pseudo-variants while keeping the original sequence and genotype data intact. Extensive simulations and real data analysis demonstrate that, in most practical situations, BETASEQ achieves higher testing power than existing approaches while keeping type-I error controlled or conservative.
BETASEQ and associated R files, including documentation and examples, are available at http://www.unc.edu/~yunmli/betaseq
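The type-I error inflation described above can be illustrated with a small null simulation. The sketch below is not BETASEQ and does not implement its pseudo-variant correction; it only contrasts a naive burden-style test on variants discovered in a sequenced subset of cases with the same test applied to all sites. The function name, the default sample sizes, the allele frequency, and the simple binomial z-test are all assumptions chosen for illustration, and Python is used here only for convenience (the released software is distributed with R files).

```python
import math
import random


def simulate_null_rates(n_cases=200, n_controls=200, n_seq=50, n_sites=40,
                        maf=0.005, n_reps=300, z_crit=1.96, seed=1):
    """Return (naive_rate, full_rate): empirical rejection rates under the null.

    Hypothetical illustration: cases and controls share the same allele
    frequency, so every rejection is a false positive.
    """
    rng = random.Random(seed)
    carrier_prob = 1.0 - (1.0 - maf) ** 2  # P(carrying at least one minor allele)
    case_share = n_cases / (n_cases + n_controls)

    def carriers(n):
        # Number of carriers at one site among n independent subjects.
        return sum(rng.random() < carrier_prob for _ in range(n))

    def rejects(case_total, control_total):
        # Binomial z-test: is the case share of carrier counts as expected?
        total = case_total + control_total
        if total == 0:
            return False
        expected = total * case_share
        sd = math.sqrt(total * case_share * (1.0 - case_share))
        return abs((case_total - expected) / sd) > z_crit

    naive_hits = full_hits = 0
    for _ in range(n_reps):
        naive_case = naive_ctrl = full_case = full_ctrl = 0
        for _ in range(n_sites):
            seq = carriers(n_seq)              # sequenced cases
            rest = carriers(n_cases - n_seq)   # remaining, genotyped cases
            ctrl = carriers(n_controls)        # genotyped controls
            full_case += seq + rest
            full_ctrl += ctrl
            if seq > 0:  # site is "discovered" only if a sequenced case carries it
                naive_case += seq + rest
                naive_ctrl += ctrl
        naive_hits += rejects(naive_case, naive_ctrl)
        full_hits += rejects(full_case, full_ctrl)
    return naive_hits / n_reps, full_hits / n_reps


if __name__ == "__main__":
    naive_rate, full_rate = simulate_null_rates()
    print(f"naive (case-only discovery): {naive_rate:.3f}")
    print(f"all sites (no ascertainment): {full_rate:.3f}")
```

Because a site enters the naive analysis only when at least one sequenced case carries it, the case counts at the retained sites are biased upward even though no variant is causal, so the naive rejection rate should sit well above the nominal 5% level while the all-sites comparison stays near it.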