Brief Bioinform. 2017 Nov 1;18(6):954-961. doi: 10.1093/bib/bbw083.
This article introduces valid and robust methods for the analysis of rare variants in family-based exome chip, whole-exome sequencing or whole-genome sequencing data. Family-based designs provide unique opportunities to detect genetic variants and complement studies of unrelated individuals. Few methods and software tools are currently available for family-based association studies with rare variants, especially for binary traits. We address this gap by extending existing burden and kernel-based gene set association tests for population data to related samples, with a particular emphasis on binary phenotypes. The proposed approach combines the strengths of kernel machine methods and generalized estimating equations. Importantly, the efficient generalized kernel score test can also serve as a mega-analysis framework for combining studies with different designs. We illustrate the proposed method using data from an exome sequencing study of autism. The methods discussed in this article are implemented in the R package 'gskat', which is available on CRAN and GitHub.
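To make the idea of a burden and a kernel score statistic for related samples with a binary trait more concrete, the following is a minimal sketch in Python. It is not the 'gskat' implementation and does not reproduce the article's test. It assumes an intercept-only null model (so the fitted probability is simply the case fraction), an independence working correlation, and a generic family-level sign-flip resampling scheme for the p-value. All function names (family_scores, burden_and_kernel, sign_flip_pvalue) are hypothetical and chosen for illustration only.

```python
import numpy as np

def family_scores(y, G, family_ids):
    """Sum the score contributions G_ij * (y_i - mu) within each family.

    Assumption: intercept-only null model, so mu is the overall case fraction.
    Returns an array of shape (n_families, n_variants).
    """
    y = np.asarray(y, dtype=float)
    G = np.asarray(G, dtype=float)
    resid = y - y.mean()
    # Map arbitrary family labels to 0..n_families-1
    _, fam_index = np.unique(np.asarray(family_ids), return_inverse=True)
    per_family = np.zeros((fam_index.max() + 1, G.shape[1]))
    # Accumulate each individual's contribution into its family's row
    np.add.at(per_family, fam_index, G * resid[:, None])
    return per_family

def burden_and_kernel(per_family, weights):
    """Return (burden score U, kernel statistic Q) from family-level scores.

    U = sum_j w_j * S_j and Q = sum_j (w_j * S_j)^2, where S_j is the
    score for variant j summed over all families.
    """
    w = np.asarray(weights, dtype=float)
    weighted = per_family.sum(axis=0) * w
    return float(weighted.sum()), float(weighted @ weighted)

def sign_flip_pvalue(per_family, weights, n_draws=1000, seed=0):
    """Resampling p-value for Q that flips the sign of whole families.

    Flipping entire families keeps within-family correlation intact.
    This is a generic illustration, not necessarily the scheme used by gskat.
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(weights, dtype=float)
    _, q_obs = burden_and_kernel(per_family, weights)
    signs = rng.choice([-1.0, 1.0], size=(n_draws, per_family.shape[0]))
    q_null = (((signs @ per_family) * w) ** 2).sum(axis=1)
    return float((1 + np.sum(q_null >= q_obs)) / (n_draws + 1))

# Toy data: two families of two members each, two rare variants
y = [1, 0, 1, 0]
G = [[1, 0], [0, 1], [1, 1], [0, 0]]
fam = ["F1", "F1", "F2", "F2"]
S = family_scores(y, G, fam)
U, Q = burden_and_kernel(S, weights=[1.0, 1.0])
p = sign_flip_pvalue(S, weights=[1.0, 1.0], n_draws=200)
```

In practice the null model would include covariates and the variant weights would typically depend on allele frequency; both are left out here to keep the sketch short.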