Bai Zhonghao, Gholipourshahraki Tahereh, Shrestha Merina, Hjelholt Astrid, Hu Sile, Kjolby Mads, Rohde Palle Duun, Sørensen Peter
Center for Quantitative Genetics and Genomics, Aarhus University, Aarhus, Denmark.
Department of Biomedicine, Aarhus University, Aarhus, Denmark.
BMC Genomics. 2024 Dec 23;25(1):1236. doi: 10.1186/s12864-024-11026-2.
Gene set tests can pinpoint genes and biological pathways that exert small to moderate effects on complex diseases like Type 2 Diabetes (T2D). By aggregating genetic markers based on biological information, these tests can enhance the statistical power needed to detect genetic associations.
Our goal was to develop a gene set test based on Bayesian Linear Regression (BLR) models, which account for both linkage disequilibrium (LD) and the complex genetic architectures of diseases, thereby increasing the power to detect genetic associations. Through a series of simulation studies, we showed that the performance of BLR-derived gene set tests depends on several factors, including the proportion of causal markers, the size of the gene sets, the percentage of genetic variance explained by the gene set, and the genetic architecture of the trait. We also assessed the performance of the gene set tests on a real phenotype by applying them to T2D results, using KEGG pathways, eQTLs, and regulatory elements as different kinds of gene sets.
Compared with other approaches, including the gold standard MAGMA (Multi-marker Analysis of Genomic Annotation), our BLR gene set test showed superior performance. Taken together, the results on simulated and real phenotypes suggest that our BLR-based approach could more accurately identify genes and biological pathways underlying complex diseases.
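As a rough illustration of the aggregation idea behind a gene set test, the sketch below sums squared per-marker statistics over a set and compares that sum with sums from randomly drawn marker sets of the same size. It is not the BLR-based test described here, nor MAGMA: the function names `gene_set_test` and `simulate_marker_stats`, the toy data, and the random-set null are all assumptions made for illustration, and the random-set null ignores LD between markers, which the abstract identifies as something a proper test should account for.

```python
import random


def gene_set_test(marker_stats, set_indices, n_perm=10000, seed=0):
    """Return (set statistic, permutation p-value) for one marker set.

    marker_stats: per-marker statistics, e.g. z-scores or effect estimates.
    set_indices:  positions in marker_stats that belong to the gene set.
    The set statistic is the sum of squared marker statistics.
    NOTE: hypothetical helper; the null below ignores LD between markers.
    """
    rng = random.Random(seed)
    squared = [s * s for s in marker_stats]
    set_size = len(set_indices)
    observed = sum(squared[i] for i in set_indices)

    all_indices = range(len(squared))
    n_exceed = 0
    for _ in range(n_perm):
        # Null draw: a random marker set of the same size as the gene set.
        null_set = rng.sample(all_indices, set_size)
        if sum(squared[i] for i in null_set) >= observed:
            n_exceed += 1

    # Add-one correction so the empirical p-value is never exactly zero.
    p_value = (n_exceed + 1) / (n_perm + 1)
    return observed, p_value


def simulate_marker_stats(n_markers, causal_indices, shift, seed=0):
    """Toy data: standard normal statistics, shifted for causal markers."""
    rng = random.Random(seed)
    stats = [rng.gauss(0.0, 1.0) for _ in range(n_markers)]
    for i in causal_indices:
        stats[i] += shift
    return stats


if __name__ == "__main__":
    gene_set = list(range(20))  # hypothetical gene set: the first 20 markers
    stats = simulate_marker_stats(1000, gene_set, shift=3.0)
    observed, p_value = gene_set_test(stats, gene_set, n_perm=2000)
    print(f"set statistic = {observed:.1f}, permutation p = {p_value:.4f}")
```

Varying `shift`, the size of `gene_set`, or the fraction of shifted markers in the toy simulation gives a feel for why factors such as gene set size and the proportion of causal markers affect detection power.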