National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China.
Key Laboratory of Smart Farming for Agricultural Animals, Huazhong Agricultural University, Wuhan, 430070, China.
Sci China Life Sci. 2024 Jun;67(6):1133-1154. doi: 10.1007/s11427-023-2522-8. Epub 2024 Mar 29.
Detecting genes that affect specific traits (such as human diseases and crop yields) is important for treating complex diseases and improving crop quality. Genome-wide association studies (GWAS) provide new insights into complex traits by identifying trait-associated single nucleotide polymorphisms. Many GWAS summary statistics datasets covering various complex traits have been gathered recently. Studies have shown that GWAS risk loci and expression quantitative trait loci (eQTLs) often overlap, which has made gene expression an increasingly important intermediary for revealing the regulatory role of GWAS loci. In this review, we survey three types of gene-trait association detection methods that integrate GWAS summary statistics and eQTL data: colocalization methods, transcriptome-wide association study-oriented approaches, and Mendelian randomization-related methods. At the theoretical level, we discuss the differences, relationships, advantages, and disadvantages of the algorithms within these three categories. To assess their performance, we summarize the significant gene sets reported in 16 studies as influencing high-density lipoprotein, low-density lipoprotein, total cholesterol, and triglycerides, and we compare the algorithms using datasets for these four lipid traits. Based on the experimental results, we analyze the advantages and limitations of the algorithms and suggest directions for follow-up studies on detecting gene-trait associations.