Penumarthi Lasya R, Baptista Rodrigo P, Beaudry Megan S, Glenn Travis C, Kissinger Jessica C
Institute of Bioinformatics, University of Georgia. Athens, Georgia. 30602, USA.
Center for Tropical and Emerging Global Diseases, University of Georgia. Athens, Georgia 30602, USA.
bioRxiv. 2024 Feb 17:2024.02.16.580748. doi: 10.1101/2024.02.16.580748.
Cryptosporidium spp. are medically and scientifically relevant protozoan parasites that cause severe diarrheal illness in infants, immunosuppressed populations, and animals. Although most human infections are caused by C. hominis and C. parvum, several other species infect humans, including C. meleagridis, which is commonly observed in developing countries. Here, we polished and annotated a long-read genome sequence assembly for C. meleagridis TU1867, a species that infects birds and humans. The genome sequence was generated using a combination of whole genome amplification (WGA) and long-read Oxford Nanopore Technologies sequencing. The assembly was then polished with Illumina data. The chromosome-level genome assembly is 9.2 Mb with a contig N50 of 1.1 Mb. Annotation revealed 3,923 protein-coding genes. A BUSCO analysis indicates a completeness of 96.6% (n=446), including 430 (96.4%) single-copy and 1 (0.224%) duplicated apicomplexan conserved genes. The new C. meleagridis genome assembly is nearly gap-free and provides a valuable new resource for the Cryptosporidium community and for future studies on evolution and host-specificity.
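The two summary statistics quoted in the abstract, contig N50 and the BUSCO percentages, follow from simple arithmetic. The sketch below is illustrative only and is not the authors' pipeline: the function names n50 and percent are hypothetical, and the toy contig lengths are made up for the example rather than taken from the TU1867 assembly. Only the BUSCO counts (430 single-copy, 1 duplicated, n=446) come from the abstract.

```python
def n50(lengths):
    """Return the contig N50: the length L such that contigs of
    length >= L together cover at least half of the total assembly."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


def percent(count, total):
    """Express count as a percentage of total."""
    return 100 * count / total


# Hypothetical contig lengths in bp (NOT the real assembly).
toy_contigs = [1_400_000, 1_300_000, 1_100_000, 1_000_000, 900_000, 500_000]
toy_n50 = n50(toy_contigs)

# BUSCO counts as reported in the abstract.
busco_total = 446
single_copy = 430
duplicated = 1

single_pct = round(percent(single_copy, busco_total), 1)
duplicated_pct = round(percent(duplicated, busco_total), 3)
complete_pct = round(percent(single_copy + duplicated, busco_total), 1)

print(toy_n50, single_pct, duplicated_pct, complete_pct)
```

With the reported counts, the single-copy and duplicated fractions sum to the stated overall completeness, which is a quick consistency check on the figures in the abstract.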