Verily Life Sciences, San Francisco, CA, USA.
Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Cambridge, MA, USA.
Nat Commun. 2023 Sep 5;14(1):5419. doi: 10.1038/s41467-023-41185-x.
Large-scale genomic projects such as All of Us and the UK Biobank have recently introduced a new research paradigm in which data are stored centrally in cloud-based Trusted Research Environments (TREs). To characterize the advantages and drawbacks of different TRE attributes in facilitating cross-cohort analysis, we conduct a genome-wide association study (GWAS) of standard lipid measures using two approaches: meta-analysis and pooled analysis. Comparison of full summary data from both approaches with an external study shows strong correlation of known loci with lipid levels (R ~ 83-97%). Importantly, 90 variants meet the significance threshold only in the meta-analysis and 64 variants are significant only in the pooled analysis, with approximately 20% of variants in each of those groups being most prevalent in individuals of non-European, non-Asian ancestry. These findings have important implications, as technical and policy choices lead cross-cohort analyses to generate similar, but not identical, results, particularly for non-European ancestral populations.
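To make the meta-analysis side of the comparison concrete, the following is a minimal sketch of how per-cohort association estimates for a single variant could be combined. It assumes a fixed-effect, inverse-variance-weighted scheme, which the abstract does not name; the function name `ivw_meta`, the example effect sizes, and the 5e-8 threshold are illustrative assumptions, not values from the study. A pooled analysis would instead fit one regression on the combined individual-level data, which is why the two approaches can disagree near the significance threshold.

```python
import math

# Conventional genome-wide significance threshold (assumed, not stated in the abstract).
GENOME_WIDE_P = 5e-8


def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted meta-analysis for one variant.

    betas: per-cohort effect estimates
    ses:   per-cohort standard errors (same order as betas)
    Returns (combined beta, combined SE, z-score, two-sided p-value).
    """
    weights = [1.0 / se ** 2 for se in ses]          # precision of each cohort
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))           # two-sided normal p-value
    return beta, se, z, p


# Hypothetical summary statistics from two cohorts for the same variant.
beta, se, z, p = ivw_meta([0.10, 0.12], [0.02, 0.03])
significant = p < GENOME_WIDE_P
```

Because only summary statistics cross the boundary between environments, a scheme of this kind can be run without moving individual-level data out of either TRE.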