

A k-mer-based pangenome approach for cataloging seed-storage-protein genes in wheat to facilitate genotype-to-phenotype prediction and improvement of end-use quality.

Author information

Frontiers Science Center for Molecular Design Breeding, Key Laboratory of Crop Heterosis and Utilization (MOE), and Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China.

Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing, China.

Publication information

Mol Plant. 2024 Jul 1;17(7):1038-1053. doi: 10.1016/j.molp.2024.05.006. Epub 2024 May 24.

Abstract

Wheat is a staple food for more than 35% of the world's population, with wheat flour used to make hundreds of baked goods. Superior end-use quality is a major breeding target; however, improving it is especially time-consuming and expensive. Furthermore, genes encoding seed-storage proteins (SSPs) form multi-gene families and are repetitive, with gaps commonplace in several genome assemblies. To overcome these barriers and efficiently identify superior wheat SSP alleles, we developed "PanSK" (Pan-SSP k-mer) for genotype-to-phenotype prediction based on an SSP-based pangenome resource. PanSK uses 29-mer sequences that represent each SSP gene at the pangenomic level to reveal untapped diversity across landraces and modern cultivars. Genome-wide association studies with k-mers identified 23 SSP genes associated with end-use quality that represent novel targets for improvement. We evaluated the effect of rye secalin genes on end-use quality and found that removal of ω-secalins from 1BL/1RS wheat translocation lines is associated with enhanced end-use quality. Finally, using machine-learning-based prediction inspired by PanSK, we predicted the quality phenotypes with high accuracy from genotypes alone. This study provides an effective approach for genome design based on SSP genes, enabling the breeding of wheat varieties with superior processing capabilities and improved end-use quality.
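The k-mer representation mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the PanSK implementation, which the abstract does not describe in detail. It only shows, under stated assumptions, how one might collect strand-independent (canonical) 29-mers from gene sequences, keep those found in exactly one gene, and measure what fraction of a gene's marker k-mers appear in a sample. The function names (`canonical`, `kmers`, `gene_specific_kmers`, `presence_fraction`) and the choice to skip windows containing non-ACGT bases are hypothetical.

```python
from __future__ import annotations

from collections import defaultdict

K = 29  # k-mer length stated in the abstract

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def kmers(seq: str, k: int = K) -> set[str]:
    """Collect canonical k-mers from a sequence, skipping windows with non-ACGT bases."""
    seq = seq.upper()
    found = set()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if set(window) <= set("ACGT"):
            found.add(canonical(window))
    return found


def gene_specific_kmers(genes: dict[str, str], k: int = K) -> dict[str, set[str]]:
    """For each gene, keep only the k-mers that occur in no other gene (assumed marker rule)."""
    per_gene = {name: kmers(seq, k) for name, seq in genes.items()}
    owners = defaultdict(set)
    for name, gene_kmers in per_gene.items():
        for km in gene_kmers:
            owners[km].add(name)
    return {
        name: {km for km in gene_kmers if len(owners[km]) == 1}
        for name, gene_kmers in per_gene.items()
    }


def presence_fraction(marker_kmers: set[str], sample_kmers: set[str]) -> float:
    """Fraction of a gene's marker k-mers observed in a sample (0.0 if no markers)."""
    if not marker_kmers:
        return 0.0
    return len(marker_kmers & sample_kmers) / len(marker_kmers)
```

In this sketch, a per-accession vector of such presence fractions (one value per gene) could serve as the genotype input to an association test or a predictive model. The features and models actually used in the study are not given in the abstract.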

