McGuirl Melissa R, Smith Samuel Pattillo, Sandstede Björn, Ramachandran Sohini
Division of Applied Mathematics, Brown University, Providence, Rhode Island 02912.
Center for Computational Molecular Biology, Brown University, Providence, Rhode Island 02912.
Genetics. 2020 Jun;215(2):511-529. doi: 10.1534/genetics.120.303096. Epub 2020 Apr 3.
Emerging large-scale biobanks pairing genotype data with phenotype data present new opportunities to prioritize shared genetic associations across multiple phenotypes for molecular validation. Past research, by our group and others, has shown that gene-level tests of association produce biologically interpretable characterizations of the genetic architecture of a given phenotype. Here, we present a new method, Ward clustering to identify Internal Node branch length outliers using Gene Scores (WINGS), for identifying shared genetic architecture among phenotypes. The objective of WINGS is to identify groups of phenotypes, or "clusters," sharing a core set of genes enriched for mutations in cases. We validate WINGS using extensive simulation studies and then combine gene-level association tests with WINGS to identify shared genetic architecture among 81 case-control and seven quantitative phenotypes in 349,468 European-ancestry individuals from the UK Biobank. We identify eight prioritized phenotype clusters and recover multiple published gene-level associations within prioritized clusters.
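The general idea named in the abstract (Ward clustering of phenotypes by gene scores, followed by ranking internal nodes by branch length) can be illustrated with a minimal sketch. This is not the authors' implementation. The toy score matrix, the function names, and the rule of ranking clusters by branch length are all assumptions made for illustration; the abstract does not specify how WINGS defines or thresholds outliers.

```python
from itertools import combinations


def ward_linkage(rows):
    """Agglomerative Ward clustering via the Lance-Williams update.

    rows: one gene-score vector per phenotype (hypothetical toy input).
    Returns one tuple per merge: (left_id, right_id, height, members).
    Leaves have ids 0..n-1; the m-th merge creates node id n+m.
    """
    n = len(rows)
    clusters = {i: [i] for i in range(n)}
    # Squared Euclidean distances between singletons, keyed (small_id, large_id).
    dist = {
        (i, j): sum((a - b) ** 2 for a, b in zip(rows[i], rows[j]))
        for i, j in combinations(range(n), 2)
    }
    merges = []
    next_id = n
    while len(clusters) > 1:
        i, j = min(dist, key=dist.get)  # closest pair under Ward's criterion
        d_ij = dist.pop((i, j))
        n_i, n_j = len(clusters[i]), len(clusters[j])
        updated = {}
        for k in clusters:
            if k in (i, j):
                continue
            n_k = len(clusters[k])
            d_ik = dist.pop((min(i, k), max(i, k)))
            d_jk = dist.pop((min(j, k), max(j, k)))
            # Lance-Williams formula for Ward on squared distances.
            updated[(k, next_id)] = (
                (n_i + n_k) * d_ik + (n_j + n_k) * d_jk - n_k * d_ij
            ) / (n_i + n_j + n_k)
        dist.update(updated)
        members = sorted(clusters.pop(i) + clusters.pop(j))
        clusters[next_id] = members
        merges.append((i, j, d_ij ** 0.5, members))
        next_id += 1
    return merges


def rank_by_branch_length(merges, n_leaves):
    """Rank non-root internal nodes by branch length, longest first.

    Branch length = parent's merge height minus the node's own merge height.
    Returns a list of (members, branch_length).
    """
    height = {n_leaves + m: h for m, (_, _, h, _) in enumerate(merges)}
    members = {n_leaves + m: mem for m, (_, _, _, mem) in enumerate(merges)}
    ranked = []
    for left, right, parent_height, _ in merges:
        for child in (left, right):
            if child >= n_leaves:  # skip leaves, keep internal nodes
                ranked.append((members[child], parent_height - height[child]))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical gene scores: rows are phenotypes, columns are genes.
scores = [
    [0.90, 0.80, 0.10, 0.00],  # phenotype 0
    [0.88, 0.82, 0.10, 0.02],  # phenotype 1, close to phenotype 0
    [0.00, 0.10, 0.90, 0.85],  # phenotype 2
    [0.02, 0.10, 0.92, 0.85],  # phenotype 3, close to phenotype 2
    [0.50, 0.10, 0.40, 0.60],  # phenotype 4, intermediate
]
merges = ward_linkage(scores)
ranked = rank_by_branch_length(merges, len(scores))
for cluster_members, branch_length in ranked:
    print(cluster_members, round(branch_length, 3))
```

In this sketch, a tight group of phenotypes merges at a low height and joins the rest of the tree much later, so its internal node sits on a long branch. Any cutoff applied to the ranked list would be a further assumption.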