Department of Biological Sciences, Dietrich School of Arts and Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA.
Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut 06520, USA.
Genome Res. 2021 May;31(5):834-851. doi: 10.1101/gr.262816.120. Epub 2021 Apr 27.
Toxoplasma gondii is a useful model for intracellular parasitism given its ease of culture in the laboratory and its genomic resources. However, as for many other eukaryotes, the T. gondii genome contains hundreds of sequence gaps owing to repetitive and/or unclonable sequences that disrupt the assembly process. Here, we use the Oxford Nanopore MinION platform to generate near-complete de novo genome assemblies for multiple strains of T. gondii and its near relative, Neospora caninum. We significantly improved T. gondii genome contiguity (average N50 of ∼6.6 Mb) and added ∼2 Mb of newly assembled sequence. For all of the T. gondii strains that we sequenced (RH, ME49, CTG, and the II×III progeny clones CL13, S27, S21, S26, and D3X1), the largest contig was between 11.9 and 12.1 Mb, which is larger than any previously reported T. gondii chromosome; we found this to be due to a consistent fusion of Chromosomes VIIb and VIII. These data were validated by mapping existing T. gondii ME49 Hi-C data to our assembly, providing parallel lines of evidence that the T. gondii karyotype consists of 13, rather than 14, chromosomes. By using this technology, we also resolved hundreds of tandem repeats of varying lengths, including in well-known host-targeting effector loci like rhoptry protein 5 (ROP5) and ROP38. Finally, when we compared T. gondii with N. caninum, we found that although the 13-chromosome karyotype was conserved, extensive, previously unappreciated chromosome-scale rearrangements had occurred in T. gondii and N. caninum since their most recent common ancestry.
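The contiguity figure quoted above is an N50: the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the total assembly size. The sketch below illustrates that standard definition only. The function name `n50` and the contig lengths in `toy_contigs` are hypothetical values chosen for illustration; they are not data from this study, and the authors' own tooling for computing N50 is not described here.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    N50 is the length of the contig at which the cumulative sum of
    lengths, taken from longest to shortest, first reaches at least
    half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (lengths in bp); NOT values from the paper.
toy_contigs = [12_000_000, 7_500_000, 6_600_000, 6_000_000, 4_000_000, 2_000_000]
toy_n50 = n50(toy_contigs)
print(f"N50 = {toy_n50 / 1e6:.1f} Mb")
```

Because N50 depends only on the sorted length distribution, merging two contigs into one longer contig (as with the fused Chromosomes VIIb and VIII) can raise it without adding any new sequence.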