Department of Biostatistics, The University of North Carolina at Chapel Hill, 27599, USA.
Am J Hum Genet. 2011 Jul 15;89(1):82-93. doi: 10.1016/j.ajhg.2011.05.029. Epub 2011 Jul 7.
Sequencing studies are increasingly being conducted to identify rare variants associated with complex traits. The limited power of classical single-marker association analysis for rare variants poses a central challenge in such studies. We propose the sequence kernel association test (SKAT), a supervised, flexible, computationally efficient regression method to test for association between genetic variants (common and rare) in a region and a continuous or dichotomous trait while easily adjusting for covariates. As a score-based variance-component test, SKAT can quickly calculate p values analytically by fitting the null model containing only the covariates, and so can easily be applied to genome-wide data. Using SKAT to analyze a genome-wide sequencing study of 1000 individuals, by segmenting the whole genome into 30 kb regions, requires only 7 hr on a laptop. Through analysis of simulated data across a wide range of practical scenarios and triglyceride data from the Dallas Heart Study, we show that SKAT can substantially outperform several alternative rare-variant association tests. We also provide analytic power and sample-size calculations to help design candidate-gene, whole-exome, and whole-genome sequence association studies.
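The score-based variance-component idea in the abstract can be illustrated with a minimal sketch for a continuous trait. None of the details below are stated in the abstract, so they should be read as assumptions: the function name `skat_linear` is hypothetical, the kernel is taken to be a weighted linear kernel, the variant weights are assumed to be Beta(1, 25) densities evaluated at the minor-allele frequency, and the p value uses a Satterthwaite moment-matching approximation to the mixture of chi-square null distribution rather than whatever exact analytic method the authors use. What the sketch does preserve is the property the abstract emphasizes: only the null model (trait regressed on covariates) is fitted.

```python
import numpy as np
from scipy import stats


def skat_linear(y, X, G, beta_params=(1.0, 25.0)):
    """Toy SKAT-style score test for a continuous trait (illustrative only).

    y: (n,) trait values; X: (n, p) covariates without intercept;
    G: (n, m) genotype matrix coded as 0/1/2 minor-allele counts.
    Returns (Q, p_value).
    """
    y = np.asarray(y, dtype=float)
    G = np.asarray(G, dtype=float)
    n = y.shape[0]
    X1 = np.column_stack([np.ones(n), X])  # add an intercept column

    # Fit the null model: trait on covariates only, no genotypes.
    coef = np.linalg.lstsq(X1, y, rcond=None)[0]
    resid = y - X1 @ coef
    sigma2 = resid @ resid / (n - X1.shape[1])  # assumes X1 has full column rank

    # Assumed weighting scheme: Beta(1, 25) density at each variant's MAF,
    # which up-weights rarer variants.
    af = G.mean(axis=0) / 2.0
    maf = np.minimum(af, 1.0 - af)
    w = stats.beta.pdf(maf, *beta_params)

    # Variance-component score statistic: sum of squared weighted scores.
    GW = G * w
    score = GW.T @ resid
    Q = float(score @ score)

    # Under the null, Q is approximately a weighted sum of chi-square(1)
    # variables. The weights are sigma2 times the eigenvalues of the
    # weighted genotype cross-product after projecting out the covariates.
    GW_res = GW - X1 @ np.linalg.lstsq(X1, GW, rcond=None)[0]
    lam = sigma2 * np.linalg.eigvalsh(GW_res.T @ GW_res)
    lam = lam[lam > 1e-10 * lam.max()] if lam.max() > 0 else lam[:0]
    if lam.size == 0:  # no genotype variation left after adjustment
        return Q, 1.0

    # Satterthwaite approximation (a swapped-in simplification): match the
    # mean and variance of the mixture with a scaled chi-square.
    scale = (lam ** 2).sum() / lam.sum()
    dof = lam.sum() ** 2 / (lam ** 2).sum()
    return Q, float(stats.chi2.sf(Q / scale, dof))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    G_demo = rng.binomial(2, 0.05, size=(300, 8))
    X_demo = rng.normal(size=(300, 2))
    y_demo = X_demo @ np.array([0.4, -0.2]) + rng.normal(size=300)
    print(skat_linear(y_demo, X_demo, G_demo))
```

Because the null fit does not depend on the region, a genome-wide scan could in principle compute the residuals once and reuse them for every region; the sketch refits them per call only to stay self-contained.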