Replication of Known and Identification of Novel Associations in Biobank-Scale Datasets: A Survey Using UK Biobank and FinnGen.

Affiliation

Department of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, 199034 St. Petersburg, Russia.

Publication information

Genes (Basel). 2024 Jul 17;15(7):931. doi: 10.3390/genes15070931.

Abstract

Over the last two decades, numerous genome-wide association studies (GWAS) have been performed to unveil the genetic architecture of human complex traits. Despite multiple efforts aimed at the trans-biobank integration of GWAS results, no systematic analysis of the variant-level properties affecting the replication of known associations (or identifying novel ones) in genome-wide meta-analysis has yet been performed using biobank-scale data. To address this issue, we performed a systematic comparison of GWAS summary statistics for 679 complex traits in the UK Biobank (UKB) and FinnGen (FG) cohorts. We identified 37,148 index variants with genome-wide associations with at least one trait in either cohort or in the meta-analysis, only 3528 (9.5%) of which were shared between UKB and FG. Nearly twice as many variants (6577) were replicated in another dataset at the significance level adjusted for the number of variants selected for replication. However, as many as 9230 loci failed to be replicated. Moreover, as many as 5813 loci were observed as significant associations only in meta-analysis results, highlighting the importance of trans-biobank meta-analysis efforts. We showed that variants that failed to replicate in UKB or FG tend to correspond to rare, less pleiotropic variants with lower effect sizes and lower LD score values. Genome-wide associations specific to meta-analysis were also enriched in low-effect variants; however, such variants tended to be more common and have more consistent frequencies between populations. Taken together, our results show a relatively high rate of non-replication of genome-wide associations in the studied cohorts and highlight both widely appreciated and less acknowledged properties of the associations affecting their identification and replication.
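The abstract mentions two mechanisms without giving formulas: replication at a significance level adjusted for the number of variants tested, and associations that reach genome-wide significance only in the meta-analysis. The sketch below illustrates both under stated assumptions. It assumes a Bonferroni-style adjustment, a fixed-effects inverse-variance-weighted meta-analysis, and the conventional 5e-8 genome-wide threshold; none of these are confirmed by the abstract as the authors' actual method, and all function names and input values are hypothetical.

```python
import math

# Conventional genome-wide threshold (assumption; not stated in the abstract).
GENOME_WIDE_ALPHA = 5e-8


def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2))


def replication_alpha(n_variants_tested, alpha=0.05):
    """Bonferroni-style threshold: alpha divided by the number of variants tested."""
    return alpha / n_variants_tested


def ivw_meta(beta1, se1, beta2, se2):
    """Fixed-effects inverse-variance-weighted meta-analysis of two cohorts."""
    w1, w2 = 1.0 / se1 ** 2, 1.0 / se2 ** 2
    beta = (w1 * beta1 + w2 * beta2) / (w1 + w2)
    se = math.sqrt(1.0 / (w1 + w2))
    return beta, se, two_sided_p(beta / se)


# Share of index variants found in both cohorts, from the counts in the abstract.
shared_pct = round(100 * 3528 / 37148, 1)

# Hypothetical variant with the same modest effect in both cohorts:
# below the genome-wide threshold in each cohort, above it when combined.
p_single = two_sided_p(0.05 / 0.01)
beta_meta, se_meta, p_meta = ivw_meta(0.05, 0.01, 0.05, 0.01)

if __name__ == "__main__":
    print(f"shared: {shared_pct}%")
    print(f"single-cohort p = {p_single:.2e}, meta-analysis p = {p_meta:.2e}")
```

With these made-up inputs, the single-cohort p-value stays above 5e-8 while the combined p-value falls below it, which is the pattern the abstract describes for meta-analysis-specific, low-effect associations.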

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d69/11275374/7d8258f5de69/genes-15-00931-g001.jpg
