Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, 06466, Seeland, Germany.
Center for Integrated Breeding Research (CiBreed), Georg-August-University Göttingen, 37075, Göttingen, Germany.
Theor Appl Genet. 2019 Mar;132(3):785-796. doi: 10.1007/s00122-018-3234-z. Epub 2018 Nov 16.
The concept of a pan-genome refers to intraspecific diversity in genome content and structure, encompassing both genes and intergenic space. Pan-genomic studies employ a combination of de novo sequence assembly and reference-based alignment to discover and genotype structural variants. The large size and complex structure of Triticeae genomes were long an obstacle to genomic research in barley and its relatives. Now that a reference genome is available, computational pipelines for high-quality sequence assembly are in place, and sequencing costs continue to drop, investigations into the structural diversity of the barley genome seem within reach. Here, we review recent progress on pan-genomics in the model grass Brachypodium distachyon and the cereal crops rice and maize, and devise a multi-tiered strategy for a pan-genome project in barley. Our design involves: (1) the construction of high-quality de novo sequence assemblies for a small core set of representative genotypes, (2) short-read sequencing of a large diversity panel of genebank accessions to medium coverage, and (3) the use of complementary methods such as chromosome-conformation capture sequencing and k-mer-based association genetics. The in silico representation of the barley pan-genome may shed light on the mechanisms of structural genome evolution in the Triticeae and supplement quantitative genetics models of crop performance for better accuracy and predictive ability.
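As a rough illustration of the k-mer-based view of presence/absence variation mentioned in the abstract, the following toy sketch builds a k-mer presence/absence table across a few accessions and separates k-mers shared by all accessions from variable ones. It is not the authors' pipeline: the function names (`kmers`, `presence_absence`, `split_core_variable`), the accession labels, and the sequences are all hypothetical, and a real analysis would count canonical k-mers from sequencing reads with a dedicated k-mer counter rather than from short strings in memory.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all k-mers in a sequence (forward strand only)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def presence_absence(accessions, k):
    """Map each k-mer to the set of accession names in which it occurs."""
    table = defaultdict(set)
    for name, seq in accessions.items():
        for kmer in kmers(seq, k):
            table[kmer].add(name)
    return table


def split_core_variable(table, n_accessions):
    """Split k-mers into those present in every accession and the rest."""
    core = {kmer for kmer, names in table.items() if len(names) == n_accessions}
    variable = set(table) - core
    return core, variable


# Hypothetical toy sequences standing in for three accessions.
toy = {
    "acc_A": "ACGTACGT",
    "acc_B": "ACGTTCGT",
    "acc_C": "ACGTACGA",
}

table = presence_absence(toy, k=4)
core, variable = split_core_variable(table, len(toy))

print("shared by all accessions:", sorted(core))
print("variable k-mers:", sorted(variable))
```

In an association setting, each row of such a table (one k-mer, present or absent per accession) would be tested against a phenotype in place of a SNP genotype; that statistical step is omitted here.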