Auburn Sarah, Böhme Ulrike, Steinbiss Sascha, Trimarsanto Hidayat, Hostetler Jessica, Sanders Mandy, Gao Qi, Nosten Francois, Newbold Chris I, Berriman Matthew, Price Ric N, Otto Thomas D
Global and Tropical Health Division, Menzies School of Health Research and Charles Darwin University, Darwin, Australia.
Malaria Programme, Wellcome Trust Sanger Institute, Hinxton, UK.
Wellcome Open Res. 2016 Nov 15;1:4. doi: 10.12688/wellcomeopenres.9876.1.
Plasmodium vivax is now the predominant cause of malaria in the Asia-Pacific, South America and the Horn of Africa. Laboratory studies of this species are constrained by the inability to maintain the parasite in continuous culture, but genomic approaches provide an alternative and complementary avenue to investigate the parasite's biology and epidemiology. To date, molecular studies of P. vivax have relied on the Salvador-I reference genome sequence, derived from a monkey-adapted strain from South America. However, the Salvador-I reference remains highly fragmented, with over 2500 unassembled scaffolds. Using high-depth Illumina sequence data, we assembled and annotated a new reference sequence, PvP01, sourced directly from a patient from Papua, Indonesia. Draft assemblies of isolates from China (PvC01) and Thailand (PvT01) were also prepared for comparative purposes. The quality of the PvP01 assembly is greatly improved over Salvador-I, with fragmentation reduced to 226 scaffolds. Detailed manual curation has ensured highly comprehensive annotation, with functions attributed to 58% of core genes in PvP01 versus 38% in Salvador-I. The assemblies of PvP01, PvC01 and PvT01 are larger than that of Salvador-I (28-30 Mb versus 27 Mb), owing to improved assembly of the subtelomeres. An extensive repertoire of over 1200 Plasmodium interspersed repeat (pir) genes was identified in PvP01, compared to 346 in Salvador-I, suggesting a vital role in parasite survival or development. The manually curated PvP01 reference and the PvC01 and PvT01 draft assemblies are important new resources for studying vivax malaria. PvP01 is maintained at GeneDB, and ongoing curation will ensure continual improvements in assembly and annotation quality.