Sharon M. Lutz, Tasha E. Fingerlin, John E. Hokanson, Christoph Lange
Department of Biostatistics, University of Colorado, Anschutz Medical Campus, Aurora, CO, USA.
Center for Genes, Environment, and Health, National Jewish Health, Denver, CO, USA.
Genet Epidemiol. 2017 Feb;41(2):163-170. doi: 10.1002/gepi.22011. Epub 2016 Nov 30.
Through genome-wide association studies, numerous genes have been shown to be associated with multiple phenotypes. To determine the overlap of genetic susceptibility of correlated phenotypes, one can apply multivariate regression or dimension reduction techniques, such as principal components analysis, and test for the association with the principal components of the phenotypes rather than the individual phenotypes. However, as these approaches test whether there is a genetic effect for at least one of the phenotypes, a significant test result does not necessarily imply pleiotropy. Recently, a method called Pleiotropy Estimation and Test Bootstrap (PET-B) has been proposed to specifically test for pleiotropy (i.e., that two normally distributed phenotypes are both associated with the single nucleotide polymorphism of interest). Although the method examines the genetic overlap between the two quantitative phenotypes, the extension to binary phenotypes, three or more phenotypes, and rare variants is not straightforward. We provide two approaches to formally test this pleiotropic relationship in multiple scenarios. These approaches depend on permuting the phenotypes of interest and comparing the set of observed P-values to the set of permuted P-values in relation to the origin (i.e., a vector of zeros), using either the Hausdorff metric or a cutoff-based approach. These approaches are appropriate for categorical and quantitative phenotypes, more than two phenotypes, common variants, and rare variants. We evaluate these approaches under various simulation scenarios and apply them to the COPDGene study, a case-control study of chronic obstructive pulmonary disease in current and former smokers.
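The permutation idea described above can be illustrated with a minimal Python sketch. It is one possible reading of the abstract, not the authors' published procedure. Everything in it is an assumption made for illustration: the function names (`assoc_pvalue`, `hausdorff`, `pleiotropy_permutation_test`), the per-phenotype association test (a Pearson correlation with a Fisher z normal approximation), the joint permutation of all phenotypes across subjects, and the use of the largest P-value as the cutoff-based statistic. With a single observed P-value vector, the Hausdorff distance to the origin reduces to the Euclidean distance of that vector from the zero vector.

```python
import math
import random


def pearson_r(x, y):
    """Sample Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def assoc_pvalue(genotype, phenotype):
    """Two-sided P-value for genotype-phenotype association.

    Assumption: Fisher z approximation (needs more than 3 subjects).
    A real analysis would use a regression model suited to the phenotype.
    """
    r = pearson_r(genotype, phenotype)
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(len(genotype) - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def hausdorff(set_a, set_b):
    """Hausdorff distance between two finite sets of points (Euclidean)."""
    def directed(p_set, q_set):
        return max(min(math.dist(p, q) for q in q_set) for p in p_set)
    return max(directed(set_a, set_b), directed(set_b, set_a))


def pleiotropy_permutation_test(genotype, phenotypes, n_perm=500, seed=0):
    """Empirical P-values for 'all phenotypes associated with the variant'.

    phenotypes: list of K lists, each holding one phenotype per subject.
    Assumption: subjects are permuted jointly across all K phenotypes, so
    the correlation among phenotypes is kept while the link to the
    genotype is broken.
    """
    rng = random.Random(seed)
    origin = [[0.0] * len(phenotypes)]
    observed = [assoc_pvalue(genotype, y) for y in phenotypes]
    d_obs = hausdorff([observed], origin)  # distance of observed P-values from 0
    max_obs = max(observed)                # smallest cutoff that all P-values pass
    idx = list(range(len(genotype)))
    hits_hausdorff = hits_cutoff = 0
    for _ in range(n_perm):
        rng.shuffle(idx)
        permuted = [[y[i] for i in idx] for y in phenotypes]
        p_perm = [assoc_pvalue(genotype, y) for y in permuted]
        # Count permutations at least as close to the origin as the observed data.
        hits_hausdorff += hausdorff([p_perm], origin) <= d_obs
        hits_cutoff += max(p_perm) <= max_obs
    return {
        "observed_pvalues": observed,
        "hausdorff": (hits_hausdorff + 1) / (n_perm + 1),
        "cutoff": (hits_cutoff + 1) / (n_perm + 1),
    }
```

Because the statistic is built only from per-phenotype P-values, the same skeleton could in principle accept binary phenotypes, more than two phenotypes, or a rare-variant test by swapping out `assoc_pvalue`. Note that jointly permuting all phenotypes simulates the case where no phenotype is associated, which is a simplification of the composite null hypothesis of no pleiotropy.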