An integrative approach to prioritize candidate causal genes for complex traits in cattle.

Author information

Ghoreishifar Mohammad, Macleod Iona M, Chamberlain Amanda J, Liu Zhiqian, Lopdell Thomas J, Littlejohn Mathew D, Xiang Ruidong, Pryce Jennie E, Goddard Michael E

Affiliations

Agriculture Victoria Research, AgriBio Centre for AgriBioscience, Bundoora, Victoria, Australia.

School of Applied Systems Biology, La Trobe University, Bundoora, Victoria, Australia.

Publication information

PLoS Genet. 2025 May 30;21(5):e1011492. doi: 10.1371/journal.pgen.1011492. eCollection 2025 May.

Abstract

Genome-wide association studies (GWAS) have identified many quantitative trait loci (QTL) associated with complex traits, predominantly in non-coding regions, which makes it difficult to pinpoint the causal variants and their target genes. Three types of evidence can help identify the gene through which a QTL acts: (1) proximity to the most significant GWAS variant, (2) correlation of gene expression with the trait, and (3) the gene's physiological role in the trait. However, it remains uncertain how well these methods identify the correct genes. Here, we test these methods on a comparatively simple series of traits associated with the concentration of polar lipids in milk. We conducted single-trait GWAS for ~14 million imputed variants and 56 individual milk polar lipid (PL) phenotypes in 336 cows. A multi-trait meta-analysis of GWAS identified 10,063 significant SNPs at FDR ≤ 10% (P ≤ 7.15E-5). Transcriptome data from blood (~12.5K genes, 143 cows) and mammary tissue (~12.2K genes, 169 cows) were analyzed using the genetic score omics regression (GSOR) method. This method links observed gene expression to genetically predicted phenotypes and was used to find associations between gene expression and the 56 PL phenotypes. GSOR identified 2,186 genes in blood and 1,404 in mammary tissue associated with at least one PL phenotype (FDR ≤ 1%). We partitioned the genome into non-overlapping windows of 100 kb to test for overlap between GSOR-identified genes and GWAS signals. We found a significant overlap between these two datasets, indicating that GSOR-significant genes were more likely to be located within 100 kb windows that include GWAS signals than within windows that do not (P = 0.01; odds ratio = 1.47). These windows included 70 significant genes expressed in mammary tissue and 95 in blood. Compared to all expressed genes in each tissue, these genes were enriched for lipid metabolism gene ontology (GO) terms: seven of the 70 significant mammary transcriptome genes (P < 0.01; odds ratio = 3.98) and five of the 95 significant blood genes (P < 0.10; odds ratio = 2.24) were annotated with lipid metabolism GO terms. The candidate causal genes include DGAT1, ACSM5, SERINC5, ABHD3, CYP2U1, PIGL, ARV1, SMPD5, and NPC2, with some overlap between the two tissues. The overlap between GWAS, GSOR, and GO analyses suggests that, used together, these methods are more likely to identify the genes mediating QTL, though their power remains limited, as reflected by the modest odds ratios. Larger sample sizes would enhance the power of these analyses, but issues like linkage disequilibrium would remain.
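The window-overlap test described in the abstract can be illustrated with a small sketch. The abstract does not say which statistical test produced the reported P-value and odds ratio, nor how a gene is assigned to a window, so the following assumes a one-sided Fisher's exact test on a 2x2 table and a single coordinate per gene. All function names and the toy data are hypothetical and are not taken from the paper.

```python
from math import comb

WINDOW_BP = 100_000  # 100 kb non-overlapping windows, as in the abstract


def window_id(chrom, pos, size=WINDOW_BP):
    """Map a base-pair position to its non-overlapping window."""
    return (chrom, pos // size)


def contingency(genes, gwas_snps, significant_genes):
    """Build a 2x2 table of expressed genes.

    a: expression-significant gene, window contains a GWAS signal
    b: expression-significant gene, window has no GWAS signal
    c: non-significant gene, window contains a GWAS signal
    d: non-significant gene, window has no GWAS signal
    Assumption: each gene is placed by one coordinate (e.g. its start).
    """
    gwas_windows = {window_id(chrom, pos) for chrom, pos in gwas_snps}
    a = b = c = d = 0
    for name, (chrom, pos) in genes.items():
        in_gwas = window_id(chrom, pos) in gwas_windows
        if name in significant_genes:
            if in_gwas:
                a += 1
            else:
                b += 1
        elif in_gwas:
            c += 1
        else:
            d += 1
    return a, b, c, d


def odds_ratio(a, b, c, d):
    """Sample odds ratio (a*d)/(b*c); infinite if b*c is zero."""
    return float("inf") if b * c == 0 else (a * d) / (b * c)


def fisher_one_sided(a, b, c, d):
    """P(X >= a) under the hypergeometric null of no enrichment."""
    n, row1, col1 = a + b + c + d, a + b, a + c
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, col1)


# Toy data (invented for illustration only).
genes = {
    "g1": ("1", 50_000), "g2": ("1", 150_000), "g3": ("1", 250_000),
    "g4": ("2", 50_000), "g5": ("2", 350_000), "g6": ("2", 120_000),
}
gwas_snps = [("1", 60_000), ("2", 10_000), ("2", 399_999)]
significant = {"g1", "g2", "g4"}

table = contingency(genes, gwas_snps, significant)
print(table, odds_ratio(*table), fisher_one_sided(*table))
```

The same 2x2 layout could in principle be reused for the GO enrichment step by replacing "window contains a GWAS signal" with "gene is annotated with a lipid metabolism GO term", although the abstract does not give the background counts needed to reproduce the reported odds ratios.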

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8423/12158001/7005ab92f338/pgen.1011492.g001.jpg
