Deng Yamin, He Tao, Fang Ruiling, Li Shaoyu, Cao Hongyan, Cui Yuehua
Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China.
Department of Mathematics, San Francisco State University, San Francisco, CA, United States.
Front Genet. 2020 May 19;11:437. doi: 10.3389/fgene.2020.00437. eCollection 2020.
Genome-wide association studies focusing on a single phenotype have been broadly conducted to identify genetic variants associated with a complex disease. The commonly applied single variant analysis is limited because it fails to consider the complex interactions between variants, which motivated the development of association analyses focusing on genes or gene sets. Moreover, when multiple correlated phenotypes are available, methods based on a multi-trait analysis can improve the association power. However, most currently available multi-trait analyses are single variant-based analyses; thus, they have limited power when disease variants function as a group in a gene or a gene set. In this work, we propose a genome-wide gene-based multi-trait analysis method that treats genes as testing units. For a given phenotype, we adopt a rapid and powerful kernel-based testing method that can evaluate the joint effect of multiple variants within a gene. The joint effect, either linear or nonlinear, is captured through kernel functions. Given a series of candidate kernel functions, we propose an omnibus test strategy to integrate the test results based on different candidate kernels. A p-value combination method is then applied to integrate dependent p-values to assess the association between a gene and multiple correlated phenotypes. Simulation studies show reasonable type I error control and excellent power for the proposed method compared with its counterparts. We further show the utility of the method by applying it to two data sets, the Human Liver Cohort and the Alzheimer's Disease Neuroimaging Initiative data set, in which novel genes are identified. Our method has broad applications in other fields in which the interest is to evaluate the joint effect (linear or nonlinear) of a set of variants.
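The step of integrating dependent p-values across phenotypes can be illustrated with a small sketch. The abstract does not say which combination rule is used, so the example below substitutes the Cauchy combination rule purely as an illustration of a combination rule designed to tolerate dependence; it should not be read as the authors' procedure. The function name `cauchy_combine` and the equal default weights are assumptions made for this sketch.

```python
import math


def cauchy_combine(pvalues, weights=None):
    """Combine possibly dependent p-values with the Cauchy combination rule.

    Illustrative only: the combination method used in the paper is not
    specified in the abstract, and this rule is a stand-in.
    """
    if not pvalues:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p < 1.0) for p in pvalues):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if weights is None:
        # Assumption: equal weights that sum to one.
        weights = [1.0 / len(pvalues)] * len(pvalues)
    if len(weights) != len(pvalues):
        raise ValueError("weights and pvalues must have the same length")

    # Map each p-value to a Cauchy-distributed quantity and take the
    # weighted sum: T = sum_i w_i * tan((0.5 - p_i) * pi).
    statistic = sum(
        w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, pvalues)
    )

    # Under the null, T is approximately standard Cauchy, so the combined
    # p-value is the upper tail probability of T.
    return 0.5 - math.atan(statistic) / math.pi


# Hypothetical per-phenotype p-values for one gene.
gene_pvalues = [0.01, 0.20, 0.50]
combined = cauchy_combine(gene_pvalues)
print(f"combined p-value: {combined:.4f}")
```

With a single input, or with inputs that are all equal, the rule returns that same p-value, which is a quick sanity check on the transformation.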