Department of Information and Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Esbo, Finland, Center for Communicable Disease Dynamics, Harvard School of Public Health, Boston, MA, USA, Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Unit of Public Health Genomics, National Institute for Health and Welfare, Helsinki, Computational Medicine, Institute of Health Sciences, University of Oulu and Oulu University Hospital, Oulu, NMR Metabolomics Laboratory, School of Pharmacy, University of Eastern Finland, Kuopio, Finland, Department of Epidemiology and Biostatistics, MRC Health Protection Agency (HPA) Centre for Environment and Health, School of Public Health, Imperial College, London, UK, Institute of Health Sciences, Biocenter Oulu, University of Oulu, Oulu, Department of Clinical Physiology, Tampere University Hospital and University of Tampere, Department of Clinical Chemistry, Fimlab Laboratories, University of Tampere School of Medicine, Tampere, Finland, Computational Medicine, School of Social and Community Medicine and the Medical Research Council Integrative Epidemiology Unit, University of Bristol, Bristol, UK, Department of Clinical Physiology and Nuclear Medicine, Research Centre of Applied and Preventive Cardiovascular Medicine, University of Turku and Turku University Hospital, Turku, Department of Chronic Disease Prevention, National Institute for Health and Welfare, Helsinki, Unit of Primary Care, Oulu University Hospital, Department of Children and Young People and Families, National Institute for Health and Welfare, Oulu, Finland, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK, Hjelt Institute and Department of Computer Science, Helsinki Institute for Information Technology HIIT, University of Helsinki, Helsinki, Finland.
Bioinformatics. 2014 Jul 15;30(14):2026-34. doi: 10.1093/bioinformatics/btu140. Epub 2014 Mar 24.
A typical genome-wide association study searches for associations between single nucleotide polymorphisms (SNPs) and a univariate phenotype. However, there is growing interest in investigating associations between genomic data and multivariate phenotypes, for example, in gene expression or metabolomics studies. A common approach is to perform a univariate test for each genotype-phenotype pair and then apply a stringent significance cutoff to account for the large number of tests performed. However, this approach has limited ability to uncover dependencies involving multiple variables. Another trend in current genetics is the investigation of the impact of rare variants on the phenotype, where standard methods often fail owing to lack of power when the minor allele is present in only a limited number of individuals.
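The pairwise strategy described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy data, the function names (`pearson_r`, `approx_p_value`, `pairwise_scan`), the use of a correlation test with a Fisher z normal approximation, and the Bonferroni correction as the "stringent cutoff". It is not taken from the study's own analysis.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no association signal
    return sxy / math.sqrt(sxx * syy)

def approx_p_value(r, n):
    """Two-sided p-value via the Fisher z-transform (normal approximation)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))

def pairwise_scan(genotypes, phenotypes, alpha=0.05):
    """Test every SNP-trait pair; genotypes and phenotypes are lists of columns."""
    n = len(genotypes[0])
    n_tests = len(genotypes) * len(phenotypes)
    cutoff = alpha / n_tests  # Bonferroni: one shared cutoff for all tests
    hits = []
    for i, g in enumerate(genotypes):
        for j, y in enumerate(phenotypes):
            p = approx_p_value(pearson_r(g, y), n)
            if p < cutoff:
                hits.append((i, j, p))
    return cutoff, hits

# Hypothetical toy data: 5 SNPs (allele counts 0/1/2), 4 traits, 500 individuals.
random.seed(0)
n = 500
genotypes = [[random.choice([0, 1, 2]) for _ in range(n)] for _ in range(5)]
phenotypes = [[random.gauss(0, 1) for _ in range(n)] for _ in range(4)]
# Plant one true effect: trait 0 depends on SNP 0.
phenotypes[0] = [0.8 * g + e for g, e in zip(genotypes[0], phenotypes[0])]

cutoff, hits = pairwise_scan(genotypes, phenotypes)
print(f"per-test cutoff: {cutoff:.4g}")
for i, j, p in hits:
    print(f"SNP {i} - trait {j}: p = {p:.3g}")
```

Because each pair is tested in isolation, the cutoff shrinks with the product of the number of SNPs and traits, and an effect spread thinly across many traits or SNPs can fall below it in every individual test.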
We propose a new statistical approach based on Bayesian reduced rank regression to assess the impact of multiple SNPs on a high-dimensional phenotype. Because of the method's ability to combine information over multiple SNPs and phenotypes, it is particularly suitable for detecting associations involving rare variants. We demonstrate the potential of our method and compare it with alternatives using the Northern Finland Birth Cohort with 4702 individuals, for whom genome-wide SNP data along with lipoprotein profiles comprising 74 traits are available. We discovered two genes (XRCC4 and MTHFD2L) without previously reported associations, which replicated in a combined analysis of two additional cohorts: 2390 individuals from the Cardiovascular Risk in Young Finns study and 3659 individuals from the FINRISK study.
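To give a feel for what "reduced rank" means here, the following is a minimal sketch of classical least-squares reduced rank regression on simulated data. It is deliberately not the Bayesian formulation proposed in the abstract, and it is not the authors' R code; the function name `reduced_rank_regression`, the NumPy implementation, and all simulated quantities are assumptions for illustration only.

```python
import numpy as np

def reduced_rank_regression(X, Y, rank):
    """Least-squares fit of Y ~ X @ B subject to rank(B) <= rank.

    Classical construction (identity weighting): fit ordinary least squares,
    then project the coefficients onto the leading right singular vectors
    of the fitted values.
    """
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    Y_fit = X @ B_ols
    _, _, Vt = np.linalg.svd(Y_fit, full_matrices=False)
    V = Vt[:rank].T                      # (n_traits, rank)
    return B_ols @ V @ V.T               # (n_snps, n_traits), rank <= rank

# Hypothetical simulation: 6 SNPs affect 10 traits through one shared component.
rng = np.random.default_rng(0)
n, n_snps, n_traits = 200, 6, 10
X = rng.integers(0, 3, size=(n, n_snps)).astype(float)  # allele counts 0/1/2
X -= X.mean(axis=0)                                      # center genotypes
a = rng.normal(size=(n_snps, 1))     # SNP weights on the shared component
b = rng.normal(size=(1, n_traits))   # trait loadings of the shared component
Y = X @ (a @ b) + 0.1 * rng.normal(size=(n, n_traits))

B_hat = reduced_rank_regression(X, Y, rank=1)
print("rank of estimate:", np.linalg.matrix_rank(B_hat))
print("max abs error vs. true coefficients:", np.abs(B_hat - a @ b).max())
```

The rank constraint forces all SNP-trait effects to pass through a small number of shared components, which is how information is pooled across SNPs and phenotypes instead of being estimated pair by pair.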
R code is freely available for download at http://users.ics.aalto.fi/pemartti/gene_metabolome/.