Xiang Ruidong, Fang Lingzhao, Liu Shuli, Liu George E, Tenesa Albert, Gao Yahui, Mason Brett A, Chamberlain Amanda J, Goddard Michael E
Agriculture Victoria, AgriBio, Centre for AgriBiosciences, Bundoora, VIC 3083, Australia.
The School of Applied Systems Biology, La Trobe University, Bundoora, VIC 3083, Australia.
PNAS Nexus. 2025 Jul 2;4(7):pgaf208. doi: 10.1093/pnasnexus/pgaf208. eCollection 2025 Jul.
To complete the genome-to-phenome map, transcriptome-wide association studies (TWAS) are performed to correlate genetically predicted gene expression with observed phenotypic measurements. However, the relatively small training population assayed for gene expression can limit the accuracy of TWAS. We propose genetic score omics regression (GSOR), which correlates observed gene expression with genetically predicted phenotype, i.e. estimated breeding values (EBVs) in agriculture or polygenic scores (PGS) in medicine. The score, calculated using variants near genes with assayed expression (cis-EBV or cis-PGS), provides a powerful association test between effects on gene expression and the trait. In simulated and real data, GSOR outperforms TWAS in detecting causal/informative genes. We applied GSOR to transcriptomes of 16 tissues (∼5,000) and 37 traits in ∼120,000 cattle and conducted multitrait meta-analyses of omics-associations (MTAO). We found that, on average, each significant gene expression or splicing event mediates cis-genetic effects on 8-10 traits. Many genes prioritized by GSOR and MTAO can be verified by Mendelian randomization analysis and show significantly reduced dN/dS, suggesting elevated evolutionary constraint on these genes. Using multiple methods, we detect expression levels of genes and/or RNA splicing events, underlying loci previously thought to be single-gene loci, that influence multiple traits. For example, the expression and RNA splicing of from multiple tissues regulated milk production, mastitis, gestation length, temperament, and stature. Also, gene expression and splicing of (Histo-blood group) and (acetylcholinesterase, Cartwright blood group) affected protein concentration and mastitis, respectively. Taken together, our work provides new methods and biological insights for prioritizing informative omics-phenotype associations in mammals.
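The core idea of GSOR described above, relating observed gene expression to a genetically predicted phenotype built from variants near the gene, can be illustrated with a toy simulation. This is a minimal sketch and not the authors' implementation: it uses ordinary least squares in place of whatever model the paper fits, it treats the SNP effects on the trait as already known (in practice they would be estimated in a large reference population), and all names (`cis_score`, `gsor_test`, the simulated genotypes and weights) are hypothetical.

```python
import numpy as np


def cis_score(genotypes, snp_effects):
    """Genetically predicted phenotype from variants near one gene.

    genotypes: (n_individuals, n_snps) allele counts.
    snp_effects: (n_snps,) assumed-known trait effects of those SNPs.
    """
    return genotypes @ snp_effects


def gsor_test(expression, score):
    """Simple regression of the cis score on observed expression.

    Returns (slope, t_statistic). Plain OLS is an assumption made for
    illustration only.
    """
    x = expression - expression.mean()
    y = score - score.mean()
    n = x.size
    slope = (x @ y) / (x @ x)
    resid = y - slope * x
    se = np.sqrt((resid @ resid) / (n - 2) / (x @ x))
    return slope, slope / se


# Toy data: 500 individuals, 20 cis SNPs (all values are made up).
rng = np.random.default_rng(0)
n, m = 500, 20
genotypes = rng.binomial(2, 0.3, size=(n, m)).astype(float)

# Hypothetical SNP effects on expression; trait effects are assumed
# to act through expression, so they are proportional to these.
expr_weights = np.linspace(-0.5, 0.5, m)
trait_effects = 0.5 * expr_weights

score = cis_score(genotypes, trait_effects)

# A gene whose expression is driven by the cis SNPs, and a null gene
# whose expression is pure noise.
expr_causal = genotypes @ expr_weights + rng.normal(size=n)
expr_null = rng.normal(size=n)

slope_causal, t_causal = gsor_test(expr_causal, score)
slope_null, t_null = gsor_test(expr_null, score)
print(f"causal gene t = {t_causal:.2f}, null gene t = {t_null:.2f}")
```

In this sketch the gene whose expression shares cis variants with the trait yields a much larger test statistic than the null gene. Because the score can be computed for every individual with expression data without needing their phenotypes, the effective information behind the trait side of the test comes from the large reference population rather than the small expression panel, which is the advantage the abstract claims over TWAS.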