He Tao, Li Shaoyu, Zhong Ping-Shou, Cui Yuehua
Department of Mathematics, San Francisco State University, San Francisco, California.
Department of Mathematics and Statistics, University of North Carolina at Charlotte, Charlotte, North Carolina.
Genet Epidemiol. 2019 Mar;43(2):137-149. doi: 10.1002/gepi.22170. Epub 2018 Nov 19.
Single-variant-based genome-wide association studies have successfully detected many genetic variants associated with a number of complex traits. However, their power is limited because marginal signals are weak and potential complex interactions among genetic variants are ignored. The set-based strategy was proposed as a remedy: multiple genetic variants in a given set (e.g., a gene or pathway) are evaluated jointly, so that the systematic effect of the set is considered. Among the many set-based methods, the kernel-based testing (KBT) framework is one of the most popular and powerful. Given a set of candidate kernels, one proposed approach is to choose the kernel with the smallest p-value. Such an approach, however, can yield an inflated Type 1 error rate, especially when the number of variants in a set is large. Alternatively, one can obtain p-values by permutation, which can be very time-consuming. In this study, we propose an efficient testing procedure for quantitative trait association studies that not only controls the Type 1 error rate but also achieves power close to that obtained under the optimal kernel in the candidate kernel set. Our method, a maximum kernel-based U-statistic method, is built upon the KBT framework and is based on asymptotic results under a high-dimensional setting. Hence it can efficiently handle the case where the number of variants in a set is much larger than the sample size. Both simulation and real data analysis demonstrate the advantages of the method compared with its counterparts.
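To make the kernel-selection problem concrete, the following is a minimal sketch, not the authors' procedure. The abstract gives no formulas, so everything here is an assumption for illustration: the two candidate kernels (linear and identity-by-state), the form of the U-statistic, and the function names are all hypothetical. The sketch calibrates the smallest per-kernel p-value by permutation, which is the slow alternative the abstract mentions, rather than by the asymptotic results the paper relies on.

```python
import numpy as np


def linear_kernel(G):
    """Linear kernel: inner products of genotype rows, scaled by the number of variants."""
    return G @ G.T / G.shape[1]


def ibs_kernel(G):
    """Identity-by-state kernel, assuming genotypes coded as 0/1/2 allele counts."""
    diff = np.abs(G[:, None, :] - G[None, :, :]).sum(axis=2)
    return 1.0 - diff / (2.0 * G.shape[1])


def u_statistic(K, y):
    """Average of K[i, j] * r[i] * r[j] over ordered pairs i != j, where r = y - mean(y)."""
    r = np.asarray(y, dtype=float)
    r = r - r.mean()
    n = len(r)
    # Full quadratic form minus the diagonal terms leaves only the i != j pairs.
    off_diag_sum = r @ K @ r - np.sum(np.diag(K) * r ** 2)
    return off_diag_sum / (n * (n - 1))


def min_p_permutation_test(G, y, kernels, n_perm=499, seed=0):
    """Permutation-calibrated p-value for the smallest per-kernel p-value.

    Taking the smallest per-kernel p-value at face value ignores the selection
    step; comparing it with its own permutation distribution accounts for it.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    Ks = [kernel(G) for kernel in kernels]
    # Row 0 holds the observed trait; the remaining rows hold permuted traits.
    ys = [y] + [rng.permutation(y) for _ in range(n_perm)]
    T = np.array([[u_statistic(K, yv) for K in Ks] for yv in ys])
    # exceed[b, k] counts the rows whose statistic for kernel k is >= that of row b.
    exceed = (T[None, :, :] >= T[:, None, :]).sum(axis=1)
    p = exceed / T.shape[0]      # per-kernel permutation p-value of every row
    min_p = p.min(axis=1)        # best kernel per row
    return float(np.mean(min_p <= min_p[0]))
```

A call such as `min_p_permutation_test(G, y, [linear_kernel, ibs_kernel])` takes an `n × m` genotype matrix `G` and a quantitative trait vector `y`. The cost grows with the number of permutations and kernels, which illustrates why an asymptotic calibration is attractive when many variant sets must be tested.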