Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA.
Am J Hum Genet. 2023 Sep 7;110(9):1522-1533. doi: 10.1016/j.ajhg.2023.07.012. Epub 2023 Aug 21.
Population-scale biobanks linked to electronic health record data provide vast opportunities to extend our knowledge of human genetics and discover new phenotype-genotype associations. Given their dense phenotype data, biobanks can also facilitate replication studies on a phenome-wide scale. Here, we introduce the phenotype-genotype reference map (PGRM), a set of 5,879 genetic associations from 523 GWAS publications that can be used for high-throughput replication experiments. PGRM phenotypes are standardized as phecodes, ensuring interoperability between biobanks. We applied the PGRM to five ancestry-specific cohorts from four independent biobanks and found evidence of robust replications across a wide array of phenotypes. We show how the PGRM can be used to detect data corruption and to empirically assess parameters for phenome-wide studies. Finally, we use the PGRM to explore factors associated with replicability of GWAS results.