DOE Joint Genome Institute, Berkeley, California 94720, USA.
Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA.
Annu Rev Plant Biol. 2021 Jun 17;72:411-435. doi: 10.1146/annurev-arplant-080720-105454. Epub 2021 Apr 13.
A pan-genome is the nonredundant collection of genes and/or DNA sequences in a species. Numerous studies have shown that plant pan-genomes are typically much larger than the genome of any individual and that a sizable fraction of the genes in any individual are present in only some genomes. The construction and interpretation of plant pan-genomes are challenging because plant genomes are large and highly repetitive. Most pan-genomes focus on protein-coding genes that are not transposable elements, because these are more easily analyzed and defined than noncoding and repetitive sequences. Nevertheless, noncoding and repetitive DNA play important roles in determining phenotype and genome evolution. Fortunately, it is now feasible to produce multiple high-quality genomes that can be used to construct high-resolution pan-genomes that capture all the variation. However, assembling, displaying, and interacting with such high-resolution pan-genomes will require the development of new tools.
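The gene-based definition above can be illustrated with a minimal sketch. It assumes each individual genome has already been reduced to a set of gene identifiers, which in practice requires annotation and orthology clustering that are not shown here. The function name `summarize_pan_genome`, the accession labels, and the gene names are hypothetical and chosen only for illustration. The pan-genome is taken as the union of the gene sets, and the genes found in every individual as their intersection.

```python
from typing import Dict, Set, Tuple


def summarize_pan_genome(
    genomes: Dict[str, Set[str]],
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (pan, core, variable) gene sets.

    genomes maps an individual's label to the set of gene identifiers
    found in that individual (hypothetical input format).
    """
    gene_sets = list(genomes.values())
    if not gene_sets:
        return set(), set(), set()
    # Pan-genome: nonredundant collection of genes across all individuals.
    pan = set().union(*gene_sets)
    # Core: genes present in every individual.
    core = set.intersection(*gene_sets)
    # Variable: genes present in only some individuals.
    variable = pan - core
    return pan, core, variable


# Toy, made-up presence/absence data for three individuals.
toy = {
    "accession_A": {"g1", "g2", "g3"},
    "accession_B": {"g1", "g2", "g4"},
    "accession_C": {"g1", "g3", "g5"},
}

pan, core, variable = summarize_pan_genome(toy)
print(len(pan), sorted(core), sorted(variable))
```

In this toy data set each individual carries three genes while the pan-genome holds five, and two of the three genes in any individual are missing from at least one other genome. This mirrors, on a very small scale, the two observations in the abstract. A set-based view like this covers only genes; it does not represent the noncoding and repetitive sequences that the abstract notes are also important.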