Center for Tropical and Emerging Global Diseases, University of Georgia, Athens, Georgia 30602, USA.
Institute of Bioinformatics, University of Georgia, Athens, Georgia 30602, USA.
Genome Res. 2022 Jan;32(1):203-213. doi: 10.1101/gr.275325.121. Epub 2021 Nov 11.
Cryptosporidiosis is a leading cause of waterborne diarrheal disease globally and an important contributor to mortality in infants and the immunosuppressed. Despite its importance, the community has only had access to a good, but incomplete, IOWA reference genome sequence. Incomplete reference sequences hamper annotation, experimental design, and interpretation. We have generated a new IOWA genome assembly supported by Pacific Biosciences (PacBio) and Oxford Nanopore long-read technologies and a new comparative and consistent genome annotation for three closely related species. We made 1926 annotation updates based on experimental evidence. They include new transporters, ncRNAs, introns, and altered gene structures. The new assembly and annotation revealed a complete methylase ortholog. Comparative annotation among the three species revealed that most "missing" orthologs are found, suggesting that the biological differences between the species must result from gene copy number variation, differences in gene regulation, and single-nucleotide variants (SNVs). Using the new assembly and annotation as reference, 190 genes are identified as evolving under positive selection, including many not detected previously. The new IOWA reference genome assembly is larger, gap free, and lacks ambiguous bases. This chromosomal assembly recovers all 16 chromosome ends, 13 of which are contiguously assembled. The three remaining chromosome ends are provisionally placed. These ends represent duplication of entire chromosome ends, including subtelomeric regions, revealing a new level of genome plasticity that will both inform and impact future research.