A scalable variational approach to characterize pleiotropic components across thousands of human diseases and complex traits using GWAS summary statistics.

Author information

Zhang Zixuan, Jung Junghyun, Kim Artem, Suboc Noah, Gazal Steven, Mancuso Nicholas

Affiliations

Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California.

Department of Quantitative and Computational Biology, University of Southern California.

Publication information

medRxiv. 2023 Mar 29:2023.03.27.23287801. doi: 10.1101/2023.03.27.23287801.

Abstract

Genome-wide association studies (GWAS) across thousands of traits have revealed the pervasive pleiotropy of trait-associated genetic variants. While methods have been proposed to characterize pleiotropic components across groups of phenotypes, scaling these approaches to ultra large-scale biobanks has been challenging. Here, we propose FactorGo, a scalable variational factor analysis model to identify and characterize pleiotropic components using biobank GWAS summary data. In extensive simulations, we observe that FactorGo outperforms the state-of-the-art (model-free) approach tSVD in capturing latent pleiotropic factors across phenotypes, while maintaining a similar computational cost. We apply FactorGo to estimate 100 latent pleiotropic factors from GWAS summary data of 2,483 phenotypes measured in European-ancestry Pan-UK BioBank individuals (N=420,531). Next, we find that factors from FactorGo are more enriched with relevant tissue-specific annotations than those identified by tSVD (P=2.58E-10), and validate our approach by recapitulating brain-specific enrichment for BMI and the height-related connection between reproductive system and muscular-skeletal growth. Finally, our analyses suggest novel shared etiologies between rheumatoid arthritis and periodontal condition, in addition to alkaline phosphatase as a candidate prognostic biomarker for prostate cancer. Overall, FactorGo improves our biological understanding of shared etiologies across thousands of GWAS.
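As a rough illustration of the tSVD baseline mentioned in the abstract, the sketch below applies a truncated singular value decomposition to a phenotype-by-variant matrix of Z-scores. It is not the authors' code and does not implement FactorGo's variational model. The simulated data, the matrix dimensions, and the helper name `truncated_svd` are all assumptions made for illustration.

```python
import numpy as np


def truncated_svd(Z, k):
    """Rank-k truncated SVD of Z (hypothetical helper, not from the paper).

    Z is assumed to be a (phenotypes x variants) matrix of GWAS Z-scores.
    Returns phenotype-side vectors U, singular values s, variant-side vectors Vt.
    """
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]


# Simulated stand-in for summary statistics (assumption: low-rank signal + noise).
rng = np.random.default_rng(0)
n_traits, n_snps, k = 50, 200, 3
loadings = rng.normal(size=(n_traits, k))   # phenotype loadings on latent factors
factors = rng.normal(size=(k, n_snps))      # variant-level latent factor scores
Z = loadings @ factors + rng.normal(size=(n_traits, n_snps))

U, s, Vt = truncated_svd(Z, k)

# Rank-k reconstruction and the share of total squared signal it captures.
Z_hat = (U * s) @ Vt
var_explained = (s ** 2).sum() / (Z ** 2).sum()
```

Because tSVD is model-free, it treats every entry of `Z` as equally reliable. The abstract describes FactorGo as a model-based alternative, so this sketch is a point of comparison only.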

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0ca2/10081403/5e8eba3a9f0f/nihpp-2023.03.27.23287801v1-f0001.jpg
