Identifying barley pan-genome sequence anchors using genetic mapping and machine learning.

Affiliations

Agriculture and Food, CSIRO, St Lucia, QLD, 4067, Australia.

Tasmanian Institute of Agriculture, University of Tasmania, Prospect, TAS, 7250, Australia.

Publication information

Theor Appl Genet. 2020 Sep;133(9):2535-2544. doi: 10.1007/s00122-020-03615-y. Epub 2020 May 24.

DOI:10.1007/s00122-020-03615-y

PMID:32448920

Abstract

We identified 1.844 million barley pan-genome sequence anchors from 12,306 genotypes using genetic mapping and machine learning. There is increasing evidence that the genes of any single crop genotype are far from covering all genes in that species; thus, building more comprehensive pan-genomes is of great importance in genetic research and breeding. Obtaining a thousand-genotype scale pan-genome using deep-sequencing data is currently impractical for species such as barley, which has a huge and highly repetitive genome. To this end, we attempted to identify barley pan-genome sequence anchors from a large quantity of genotyping-by-sequencing (GBS) datasets by combining genetic mapping and machine learning algorithms. Based on the GBS sequences from 11,166 domesticated and 1,140 wild barley genotypes, we identified 1.844 million pan-genome sequence anchors. Of these, 532,253 were identified as presence/absence variation (PAV) tags. By aligning these PAV tags to the genome of the hulless barley genotype Zangqing320, we validated 83.6% of those from the domesticated genotypes and 88.6% of those from the wild barley genotypes. Association analyses against flowering time, plant height and kernel size showed that the relative importance of the PAV and non-PAV tags varied among traits. The pan-genome sequence anchors based on GBS tags can facilitate the construction of a comprehensive pan-genome and greatly assist various genetic studies in barley, including identification of structural variation, genetic mapping and breeding.

Similar articles

The barley pan-genome reveals the hidden legacy of mutation breeding.

Nature. 2020 Dec;588(7837):284-289. doi: 10.1038/s41586-020-2947-8. Epub 2020 Nov 25.

High-resolution genetic mapping of maize pan-genome sequence anchors.

Nat Commun. 2015 Apr 16;6:6914. doi: 10.1038/ncomms7914.

Assembly and analysis of a qingke reference genome demonstrate its close genetic relation to modern cultivated barley.

Plant Biotechnol J. 2018 Mar;16(3):760-770. doi: 10.1111/pbi.12826. Epub 2017 Oct 5.

Unlocking the secondary gene-pool of barley with next-generation sequencing.

Plant Biotechnol J. 2014 Oct;12(8):1122-31. doi: 10.1111/pbi.12219. Epub 2014 Jul 6.

Development of high-density genetic maps for barley and wheat using a novel two-enzyme genotyping-by-sequencing approach.

PLoS One. 2012;7(2):e32253. doi: 10.1371/journal.pone.0032253. Epub 2012 Feb 28.

Transcriptomic and presence/absence variation in the barley genome assessed from multi-tissue mRNA sequencing and their power to predict phenotypic traits.

BMC Genomics. 2019 Oct 29;20(1):787. doi: 10.1186/s12864-019-6174-3.

Prospects of pan-genomics in barley.

Theor Appl Genet. 2019 Mar;132(3):785-796. doi: 10.1007/s00122-018-3234-z. Epub 2018 Nov 16.

Genome wide association study of plant height and tiller number in hulless barley.

PLoS One. 2021 Dec 2;16(12):e0260723. doi: 10.1371/journal.pone.0260723. eCollection 2021.

Construction of a high-density genetic map: genotyping by sequencing (GBS) to map purple seed coat color () in hulless barley.

Hereditas. 2018 Nov 17;155:37. doi: 10.1186/s41065-018-0072-6. eCollection 2018.

Cited by

The developments and prospects of plant super-pangenomes: Demands, approaches, and applications.

Plant Commun. 2025 Feb 10;6(2):101230. doi: 10.1016/j.xplc.2024.101230. Epub 2024 Dec 24.

Detection and characterization of spike architecture based on deep learning and X-ray computed tomography in barley.

Plant Methods. 2023 Oct 27;19(1):115. doi: 10.1186/s13007-023-01096-w.

Genotyping by Sequencing Advancements in Barley.

Front Plant Sci. 2022 Aug 8;13:931423. doi: 10.3389/fpls.2022.931423. eCollection 2022.

Pangenomes as a Resource to Accelerate Breeding of Under-Utilised Crop Species.

Int J Mol Sci. 2022 Feb 28;23(5):2671. doi: 10.3390/ijms23052671.

EORNA, a barley gene and transcript abundance database.

Sci Data. 2021 Mar 25;8(1):90. doi: 10.1038/s41597-021-00872-4.
