Integration of eQTL and machine learning to dissect causal genes with pleiotropic effects in genetic regulation networks of seed cotton yield.

Affiliations

Zhejiang Provincial Key Laboratory of Crop Genetic Resources, The Advanced Seed Institute, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou 300058, China; Hainan Institute of Zhejiang University, Building 11, Yonyou Industrial Park, Yazhou Bay Science and Technology City, Yazhou District, Sanya 572025, China.

Zhejiang Provincial Key Laboratory of Crop Genetic Resources, The Advanced Seed Institute, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou 300058, China.

Publication information

Cell Rep. 2023 Sep 26;42(9):113111. doi: 10.1016/j.celrep.2023.113111. Epub 2023 Sep 6.

Abstract

The dissection of a gene regulatory network (GRN) that complements the genome-wide association study (GWAS) locus and the crosstalk underlying multiple agronomical traits remains a major challenge. In this study, we generate 558 transcriptional profiles of lint-bearing ovules at one day post-anthesis from a selective core cotton germplasm, from which 12,207 expression quantitative trait loci (eQTLs) are identified. Sixty-six known phenotypic GWAS loci are colocalized with 1,090 eQTLs, forming 38 functional GRNs associated predominantly with seed yield. Of the eGenes, 34 exhibit pleiotropic effects. Combining the eQTLs within the seed yield GRNs significantly increases the portion of narrow-sense heritability. The extreme gradient boosting (XGBoost) machine learning approach is applied to predict seed cotton yield phenotypes on the basis of gene expression. Top-ranking eGenes (NF-YB3, FLA2, and GRDP1) derived with pleiotropic effects on yield traits are validated, along with their potential roles by correlation analysis, domestication selection analysis, and transgenic plants.
