Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27514, USA.
Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
Genes (Basel). 2021 Jul 8;12(7):1049. doi: 10.3390/genes12071049.
Thousands of genetic variants have been associated with hematological traits, though target genes remain unknown at most loci. Moreover, limited analyses have been conducted in African ancestry and Hispanic/Latino populations; hematological trait-associated variants that are more common in these populations have likely been missed.
To derive gene expression prediction models, we used ancestry-stratified datasets from the Multi-Ethnic Study of Atherosclerosis (MESA, including n = 229 African American and n = 381 Hispanic/Latino participants, monocytes) and the Depression Genes and Networks study (DGN, n = 922 European ancestry participants, whole blood). We then performed a transcriptome-wide association study (TWAS) for platelet count, hemoglobin, hematocrit, and white blood cell count in African (n = 27,955) and Hispanic/Latino (n = 28,324) ancestry participants.
Our results revealed 24 suggestive signals (P < 1 × 10) that were conditionally distinct from known GWAS-identified variants, and we successfully replicated these signals in European ancestry subjects from UK Biobank. We found modestly improved correlation of predicted and measured gene expression in an independent African American cohort (the Genetic Epidemiology Network of Arteriopathy (GENOA) study, n = 802, lymphoblastoid cell lines) using the larger DGN reference panel; however, some genes were well predicted using MESA but not DGN.
These analyses demonstrate the importance of performing TWAS and other genetic analyses across diverse populations and of balancing sample size and ancestry background matching when selecting a TWAS reference panel.