Norland Kristjan, Schaid Daniel J, Kullo Iftikhar J
Department of Cardiovascular Medicine, Mayo Clinic, Rochester, MN, USA.
Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA.
HGG Adv. 2025 Mar 25;6(3):100427. doi: 10.1016/j.xhgg.2025.100427.
Functional genomic annotations can improve polygenic scores (PGS) within and between genetic ancestry groups. While general annotations are commonly used in PGS development, tissue- and cell-type-specific annotations derived from open chromatin and gene expression experiments may further enhance PGS for cardiometabolic traits. We developed PGS for 14 cardiometabolic traits in the UK Biobank using SBayesRC. We integrated GWAS summary statistics from FinnGen and GLGC with three annotation sources: (1) Baseline-LD model version 2.2 (general annotations), (2) cell-type-specific snATAC-seq peaks, and (3) tissue-specific eQTLs/sQTLs. We created PGS using two EUR LD reference panels (1.2 million [1.2M] HapMap3 variants and 7M imputed variants). Tissue- and cell-type-specific annotations showed stronger heritability enrichment than Baseline-LD annotations on average, particularly coronary snATAC-seq peaks and fine-mapped eQTLs. Without annotations, HapMap3 and 7M variant PGS performed similarly. However, with all annotations, 7M variant PGS outperformed HapMap3 variant PGS (8% average increase in relative performance in EUR). Compared to using no annotations, modeling Baseline-LD annotations improved performance by 5% for HapMap3 and 11% for 7M variant PGS, while modeling all annotations yielded improvements of 5% and 13%, respectively. Although annotations provided greater relative improvement for cross-ancestry prediction, they did not decrease the disparity in PGS performance between genetic ancestry groups. In conclusion, functional annotations improved PGS for cardiometabolic traits. Despite strong heritability enrichment, tissue- and cell-type-specific snATAC-seq and eQTL annotations provided marginal performance gains beyond general genomic annotations.
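The percentage gains above are relative improvements averaged across traits. The following is a minimal sketch of that arithmetic, assuming performance is a positive per-trait metric such as incremental R² (an assumption: the abstract does not name the metric). The function names, trait labels, and numeric values are hypothetical placeholders for illustration and are not results from the study.

```python
from typing import Dict


def relative_improvement(baseline: float, candidate: float) -> float:
    """Percentage change of `candidate` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline performance must be positive")
    return 100.0 * (candidate - baseline) / baseline


def mean_relative_improvement(
    baseline_by_trait: Dict[str, float],
    candidate_by_trait: Dict[str, float],
) -> float:
    """Average the per-trait relative improvement over traits present in both inputs."""
    shared = sorted(set(baseline_by_trait) & set(candidate_by_trait))
    if not shared:
        raise ValueError("no traits in common")
    total = sum(
        relative_improvement(baseline_by_trait[t], candidate_by_trait[t])
        for t in shared
    )
    return total / len(shared)


# Hypothetical per-trait performance values (NOT taken from the study).
no_annotations = {"trait_a": 0.100, "trait_b": 0.080}
all_annotations = {"trait_a": 0.115, "trait_b": 0.088}

average_gain = mean_relative_improvement(no_annotations, all_annotations)
print(f"average relative improvement: {average_gain:.1f}%")
```

Averaging per-trait ratios, rather than taking the ratio of averaged metrics, keeps traits with high absolute performance from dominating the summary. Whether the authors aggregated this way is not stated in the abstract.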