Fiol Arnau, Jurado-Ruiz Federico, López-Girona Elena, Aranzana Maria José
Centre for Research in Agricultural Genomics, CSIC-IRTA-UAB-UB, Campus UAB, Barcelona, Spain.
The New Zealand Institute for Plant and Food Research Limited (Plant & Food Research), Private Bag 11600, Palmerston North, 4442, New Zealand.
Plant Methods. 2022 Aug 27;18(1):105. doi: 10.1186/s13007-022-00937-4.
Genome complexity is largely linked to diversification and crop innovation. Many crops carry regions of duplicated genes that play relevant roles in agricultural traits. In both duplicated and non-duplicated genes, much of the variability in agronomic traits is caused by large-, medium-, and small-scale structural variants (SVs), which makes identifying and characterizing complex variability between genomes important for plant breeding.
Here we improve and demonstrate the use of CRISPR-Cas9 enrichment combined with long-read sequencing to resolve the MYB10 region on linkage group 3 (LG3) of Japanese plum (Prunus salicina). This region, which spans 90 to 271 kb depending on the available P. salicina genome, is associated with fruit color variability in Prunus species. We demonstrate the high complexity of this region, with homology levels between Japanese plum varieties comparable to those between Prunus species. We cleaved MYB10 genes in five plum varieties using the Cas9 enzyme guided by a pool of crRNAs. The barcoded fragments were then pooled and sequenced in a single Oxford Nanopore Technologies (ONT) MinION run, yielding 194 Mb of sequence. Enrichment was confirmed by aligning the long reads to the plum reference genomes, with a mean of 4.5% of reads on target and a depth per sample of 11.9x. From the alignment, 3261 SNPs and 287 SVs were called and phased. A de novo assembly was also constructed for each variety, which allowed variability in this region to be detected at the haplotype level.
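As an illustration of how an on-target percentage and a depth figure of this kind can be derived from aligned reads, the following is a minimal sketch and not the pipeline used in the study. It assumes alignments are already reduced to (contig, start, end) intervals in 0-based half-open coordinates; the function name `on_target_stats`, the contig label "LG3", and all coordinates are hypothetical toy values.

```python
from typing import List, Tuple

# (contig, start, end), 0-based half-open; a simplified stand-in for an alignment record
Interval = Tuple[str, int, int]


def on_target_stats(reads: List[Interval], target: Interval) -> Tuple[float, float]:
    """Return (fraction of reads overlapping the target, mean depth over the target).

    Mean depth is the total number of aligned bases falling inside the
    target divided by the target length.
    """
    contig, t_start, t_end = target
    target_len = t_end - t_start
    if target_len <= 0:
        raise ValueError("target must have positive length")

    on_target = 0
    bases_in_target = 0
    for r_contig, r_start, r_end in reads:
        if r_contig != contig:
            continue
        # Length of the intersection between the read and the target
        overlap = min(r_end, t_end) - max(r_start, t_start)
        if overlap > 0:
            on_target += 1
            bases_in_target += overlap

    fraction = on_target / len(reads) if reads else 0.0
    return fraction, bases_in_target / target_len


# Toy data (hypothetical): two reads overlap the target, two do not
toy_reads = [
    ("LG3", 900, 1500),
    ("LG3", 1200, 2500),
    ("LG3", 5000, 6000),
    ("LG1", 1000, 2000),
]
toy_target = ("LG3", 1000, 2000)
fraction, depth = on_target_stats(toy_reads, toy_target)
print(f"on-target: {fraction:.1%}, mean depth: {depth:.1f}x")
```

In practice the intervals would come from an alignment file, and a read would typically be counted once by its primary alignment; this sketch leaves such filtering out.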
CRISPR-Cas9 enrichment is a versatile and powerful tool for long-read targeted sequencing, even in highly duplicated and/or polymorphic genomic regions, and is especially useful when no reference genome is available. Potential uses of this methodology and its limitations are further discussed.