Bioinformatics Graduate Program, Boston University, Boston, MA, USA.
Department of Medicine (Biomedical Genetics), Boston University School of Medicine, Boston, MA, USA.
Eur J Hum Genet. 2019 May;27(5):811-823. doi: 10.1038/s41431-018-0327-8. Epub 2019 Jan 25.
Complex diseases are usually associated with multiple correlated phenotypes, and the analysis of composite scores or disease status may not fully capture this complexity (or multidimensionality). Joint analysis of multiple disease-related phenotypes in genetic tests could increase power to detect association of a disease with common SNPs (or genes). Gene-based tests are designed to identify genes containing multiple risk variants that individually are weakly associated with a univariate trait. We combined three multivariate association tests (O'Brien method, TATES, and MultiPhen) with two gene-based association tests (GATES and VEGAS) and compared the performance (type I error and power) of the six resulting multivariate gene-based methods using simulated data. Data (n = 2000) for genetic sequence and correlated phenotypes were simulated under various scenarios by varying the proportion of causal variants and the phenotype correlations. These simulations showed that two multivariate association tests (TATES and MultiPhen, but not O'Brien) paired with VEGAS have inflated type I error in all scenarios, while all three multivariate association tests paired with GATES have correct type I error. MultiPhen paired with GATES has higher power than competing methods if the correlations among phenotypes are low (r < 0.57). We applied these gene-based association methods to a GWAS dataset from the Alzheimer's Disease Genetics Consortium containing three neuropathological traits related to Alzheimer disease (neuritic plaque, neurofibrillary tangles, and cerebral amyloid angiopathy) measured in 3500 autopsied brains. Gene-level significant evidence (P < 2.7 × 10⁻⁶) was identified in a region containing three contiguous genes (TRAPPC12, TRAPPC12-AS1, ADI1) using O'Brien and VEGAS. Gene-wide significant associations were not observed in univariate gene-based tests.
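To make the two-stage idea concrete (first combine evidence across traits for each SNP, then combine SNPs into a gene-level P value), the following is a minimal sketch and not the authors' implementation. It assumes that the correlation among per-trait z-scores can be approximated by the phenotype correlation matrix, and it uses a plain Simes combination as a simplified stand-in for GATES, which additionally adjusts for linkage disequilibrium among SNPs through an effective number of tests. The function names `obrien_combined_z` and `simes_gene_p` are hypothetical.

```python
import math
import numpy as np

def obrien_combined_z(z, trait_corr):
    """O'Brien-style GLS combination of per-trait z-scores for one SNP.

    Assumption: trait_corr approximates the correlation among the z-scores.
    """
    z = np.asarray(z, dtype=float)
    sigma = np.asarray(trait_corr, dtype=float)
    ones = np.ones_like(z)
    w = np.linalg.solve(sigma, ones)      # weights: Sigma^{-1} 1
    t = w @ z / math.sqrt(w @ ones)       # standard normal under the null
    p = math.erfc(abs(t) / math.sqrt(2))  # two-sided normal P value
    return float(t), p

def simes_gene_p(snp_pvalues):
    """Plain Simes combination of SNP P values into one gene-level P value.

    Simplification: ignores LD between SNPs, which GATES accounts for.
    """
    p = np.sort(np.asarray(snp_pvalues, dtype=float))
    m = p.size
    return float(np.min(m * p / np.arange(1, m + 1)))

# Illustrative inputs only (not data from the study).
trait_corr = np.array([[1.0, 0.3, 0.2],
                       [0.3, 1.0, 0.4],
                       [0.2, 0.4, 1.0]])
snp_z = [[2.1, 1.8, 2.5], [0.3, -0.4, 0.1], [1.2, 1.9, 0.8]]
snp_p = [obrien_combined_z(z, trait_corr)[1] for z in snp_z]
gene_p = simes_gene_p(snp_p)
```

In this sketch each SNP's multivariate P value feeds the gene-level combination, mirroring the pairing of a multivariate test with a gene-based test described in the abstract.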