Mägi Reedik, Suleimanov Yury V, Clarke Geraldine M, Kaakinen Marika, Fischer Krista, Prokopenko Inga, Morris Andrew P
Estonian Genome Center, University of Tartu, Tartu, Estonia.
Computation-based Science and Technology Research Center, Cyprus Institute, Nicosia, Cyprus.
BMC Bioinformatics. 2017 Jan 11;18(1):25. doi: 10.1186/s12859-016-1437-3.
Genome-wide association studies (GWAS) of single nucleotide polymorphisms (SNPs) have been successful in identifying loci contributing genetic effects to a wide range of complex human diseases and quantitative traits. The traditional approach to GWAS analysis is to consider each phenotype separately, despite the fact that many diseases and quantitative traits are correlated with each other, and often measured in the same sample of individuals. Multivariate analyses of correlated phenotypes have been demonstrated, by simulation, to increase power to detect association with SNPs, and thus may enable improved detection of novel loci contributing to diseases and quantitative traits.
We have developed the SCOPA software to enable GWAS analysis of multiple correlated phenotypes. The software implements "reverse regression" methodology, which treats the genotype of an individual at a SNP as the outcome and the phenotypes as predictors in a general linear model. SCOPA can be applied to quantitative traits and categorical phenotypes, and can accommodate imputed genotypes under a dosage model. The accompanying META-SCOPA software enables meta-analysis of association summary statistics from SCOPA across GWAS. Application of SCOPA to two GWAS of high- and low-density lipoprotein cholesterol, triglycerides and body mass index, and subsequent meta-analysis with META-SCOPA, highlighted stronger association signals than univariate phenotype analysis at established lipid and obesity loci. The META-SCOPA meta-analysis also revealed a novel signal of association at genome-wide significance for triglycerides mapping to GPC5 (lead SNP rs71427535, p = 1.1x10), which has not been reported in previous large-scale GWAS of lipid traits.
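The reverse regression idea described above can be illustrated with a minimal sketch. This is not the SCOPA implementation: the function name `reverse_regression_lrt`, the simulated data, and the use of a Gaussian likelihood ratio statistic for the joint test are all assumptions made for illustration. The sketch regresses the genotype dosage (between 0 and 2) on all phenotypes at once and compares the fit with an intercept-only model.

```python
import numpy as np

def reverse_regression_lrt(dosage, phenotypes):
    """Regress genotype dosage on phenotypes (hypothetical helper, not SCOPA's API).

    Returns the phenotype coefficients, a likelihood ratio statistic for the
    joint test that all coefficients are zero, and its degrees of freedom.
    """
    g = np.asarray(dosage, dtype=float)
    y = np.asarray(phenotypes, dtype=float)
    if y.ndim == 1:
        y = y[:, None]
    n, k = y.shape

    # Full model: intercept plus every phenotype as a predictor of dosage.
    x = np.column_stack([np.ones(n), y])
    coef, *_ = np.linalg.lstsq(x, g, rcond=None)
    rss_full = float(np.sum((g - x @ coef) ** 2))

    # Null model: intercept only, so the fitted value is the mean dosage.
    rss_null = float(np.sum((g - g.mean()) ** 2))

    # Gaussian likelihood ratio statistic; under the null it is approximately
    # chi-square distributed with k degrees of freedom.
    lr_stat = n * np.log(rss_null / rss_full)
    return coef[1:], lr_stat, k

# Simulated example (assumed data): dosage depends on the first phenotype only.
rng = np.random.default_rng(0)
n = 2000
phen = rng.normal(size=(n, 3))
noise = rng.normal(scale=0.5, size=n)
dosage = np.clip(0.6 + 0.15 * phen[:, 0] + noise, 0.0, 2.0)

betas, lr_stat, df = reverse_regression_lrt(dosage, phen)
print(betas, lr_stat, df)
```

Because the phenotypes enter as predictors, a single fit yields one joint test across all of them, and the individual coefficients indicate which phenotypes drive the signal. Imputed genotypes fit naturally here because the outcome is a continuous dosage.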
The SCOPA and META-SCOPA software enable discovery and dissection of multiple phenotype association signals through implementation of a powerful reverse regression approach.