Li Ming, Wei Changshuai, Wen Yalu, Wang Tong, Lu Qing
1 Department of Epidemiology and Biostatistics, Indiana University at Bloomington, Bloomington, IN 47405, U.S.A.; 2 Department of Epidemiology and Biostatistics, University of North Texas Health Science Center, Fort Worth, TX 76107, U.S.A.; 3 Department of Statistics, University of Auckland, Auckland 1010, New Zealand; 4 Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi 030001, P.R. China; 5 Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, MI 48824, U.S.A.
Curr Genomics. 2016 Oct;17(5):403-415. doi: 10.2174/1389202917666160513100946.
Many complex diseases, such as psychiatric and behavioral disorders, are commonly characterized through various measurements that reflect physical, behavioral and psychological aspects of the disease. While it remains a great challenge to find a unified measurement to characterize a disease, the available multiple phenotypes can be analyzed jointly in genetic association studies. Testing these phenotypes simultaneously has several advantages, including considering different aspects of the disease in the analysis and using correlated phenotypes to improve the power to detect disease-associated variants. Furthermore, complex diseases are likely caused by the interplay of multiple genetic variants through complicated mechanisms. Considering gene-gene interactions in the joint association analysis of complex diseases could further increase our ability to discover genetic variants involved in complex disease pathways. In this article, we propose a stepwise U-test for joint association analysis of multiple loci and multiple phenotypes. Through simulations, we demonstrate that testing multiple phenotypes simultaneously can attain higher power than testing a single phenotype at a time, especially when shared genes contribute to multiple phenotypes. We also illustrate the proposed method with an application to Nicotine Dependence (ND), using datasets from the Study of Addiction, Genetics and Environment (SAGE). The joint analysis of three ND phenotypes identified two SNPs, rs10508649 and rs2491397, with a nominal P-value of 3.79e-13. The association was further replicated in two independent datasets, with P-values of 2.37e-05 and 7.46e-05.