Department of Biostatistics, Boston University School of Public Health, Boston, MA 02118, USA.
Genet Epidemiol. 2013 Feb;37(2):196-204. doi: 10.1002/gepi.21703. Epub 2012 Dec 26.
Advances in sequencing technology and falling sequencing costs have led to the discovery of a large number of rare genetic variants. Rare variant analysis may help identify novel genes associated with diseases and quantitative traits, improving our understanding of the heritability of these phenotypes. Many statistical methods for rare variant analysis have been developed in recent years, but some require the strong assumption that all rare variants in the analysis share the same direction of effect, and others rely on permutation to calculate P-values and are therefore computationally intensive. Among these methods, the sequence kernel association test (SKAT) is powerful under many different scenarios. It requires no assumption about the directionality of effects, and statistical significance is computed analytically. In this paper, we extend SKAT to family data. The family-based SKAT (famSKAT) has a different test statistic and null distribution from SKAT, but is equivalent to SKAT when there is no familial correlation. Our simulation studies show that SKAT has inflated type I error if familial correlation is inappropriately ignored, but has appropriate type I error when applied to an unrelated subset obtained by selecting a single individual per family. In contrast, famSKAT has the correct type I error when analyzing correlated observations, and it has higher power than competing methods in many different scenarios. We illustrate our approach by analyzing the association of rare genetic variants with glycemic traits in the Framingham Heart Study.
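The relationship between a kernel score statistic for unrelated samples and a family-aware variant can be illustrated with a minimal sketch. It is not the famSKAT implementation. It assumes a generic weighted quadratic form, Q = r' G W G' r, and assumes that familial correlation enters by pre-multiplying the residuals r by the inverse of a phenotype covariance matrix. The function names, the toy data, and the sibling correlation of 0.5 are all hypothetical, and the null distribution needed for a P-value is not computed here.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def skat_type_q(residuals, genotypes, weights, sigma_inv=None):
    """Weighted quadratic-form score statistic (illustrative only).

    residuals : phenotype residuals under the null model, one per individual
    genotypes : n x p list of rows with minor-allele counts
    weights   : p nonnegative per-variant weights
    sigma_inv : optional n x n inverse phenotype covariance (assumed form of
                the family adjustment); None treats individuals as unrelated
    """
    r = residuals if sigma_inv is None else mat_vec(sigma_inv, residuals)
    q = 0.0
    for j, w in enumerate(weights):
        # Per-variant score: genotype column j times (adjusted) residuals.
        score_j = sum(genotypes[i][j] * r[i] for i in range(len(r)))
        q += w * score_j ** 2
    return q


# Hypothetical toy data: two sibling pairs, two rare variants.
residuals = [0.5, -1.0, 0.25, 0.25]
genotypes = [[1, 0], [0, 1], [1, 1], [0, 0]]
weights = [1.0, 2.0]

# Identity covariance, meaning no familial correlation.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

# Assumed within-pair correlation rho; the inverse of [[1, rho], [rho, 1]]
# is [[1, -rho], [-rho, 1]] / (1 - rho**2), placed in 2 x 2 diagonal blocks.
rho = 0.5
d = 1.0 / (1.0 - rho ** 2)
sigma_inv = [
    [d, -rho * d, 0.0, 0.0],
    [-rho * d, d, 0.0, 0.0],
    [0.0, 0.0, d, -rho * d],
    [0.0, 0.0, -rho * d, d],
]

q_unrelated = skat_type_q(residuals, genotypes, weights)
q_identity = skat_type_q(residuals, genotypes, weights, identity)
q_family = skat_type_q(residuals, genotypes, weights, sigma_inv)
```

With the identity covariance, the adjusted statistic coincides with the unadjusted one, mirroring the statement that famSKAT is equivalent to SKAT when there is no familial correlation. With a non-identity covariance the two values differ, which is why ignoring relatedness changes the behavior of the test.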