Siegel Sasha V, Trimarsanto Hidayat, Amato Roberto, Murie Kathryn, Taylor Aimee R, Sutanto Edwin, Kleinecke Mariana, Whitton Georgia, Watson James A, Imwong Mallika, Assefa Ashenafi, Rahim Awab Ghulam, Nguyen Hoang Chau, Tran Tinh Hien, Green Justin A, Koh Gavin C K W, White Nicholas J, Day Nicholas, Kwiatkowski Dominic P, Rayner Julian C, Price Ric N, Auburn Sarah
Wellcome Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK.
Menzies School of Health Research and Charles Darwin University, Darwin, Northern Territory, 0811, Australia.
Nat Commun. 2024 Aug 8;15(1):6757. doi: 10.1038/s41467-024-51015-3.
Challenges in classifying recurrent Plasmodium vivax infections constrain surveillance of antimalarial efficacy and transmission. Recurrent infections may arise from activation of dormant liver stages (relapse), blood-stage treatment failure (recrudescence) or reinfection. Molecular inference of familial relatedness (identity-by-descent or IBD) can help resolve the probable origin of recurrences. As whole-genome sequencing of P. vivax remains challenging, targeted genotyping methods are needed for scalability. We describe a P. vivax marker discovery framework to identify and select panels of microhaplotypes (multi-allelic markers within small, amplifiable segments of the genome) that can accurately capture IBD. We evaluate panels of 50-250 microhaplotypes discovered in a global set of 615 P. vivax genomes. A candidate global 100-microhaplotype panel exhibits high marker diversity in the Asia-Pacific, Latin America and Horn of Africa (median H = 0.70-0.81) and identifies 89% of the polyclonal infections detected with genome-wide datasets. Data simulations reveal lower error in estimating pairwise IBD using microhaplotypes relative to traditional biallelic SNP barcodes. The candidate global panel also exhibits high accuracy in predicting geographic origin and captures local infection outbreak and bottlenecking events. Our framework is open source, enabling customised microhaplotype discovery and selection, with potential for porting to other species or data resources.
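The marker diversity statistic H reported above is not defined in the abstract. The minimal sketch below assumes it is the standard expected heterozygosity, H = 1 - sum(p_i^2) over allele frequencies p_i, without a small-sample correction. The function name and the toy allele calls are hypothetical and are not taken from the authors' framework. It illustrates why a biallelic SNP cannot exceed H = 0.5 under this definition, whereas a multi-allelic microhaplotype can reach the 0.70-0.81 range.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """Expected heterozygosity, H = 1 - sum(p_i^2), for one marker.

    `alleles` is a list of allele calls, one per sample (hypothetical input
    format). Assumption: no small-sample correction is applied.
    """
    n = len(alleles)
    if n == 0:
        raise ValueError("at least one allele call is required")
    counts = Counter(alleles)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


# Toy data (illustrative only, not from the study).
# A biallelic SNP at 50/50 frequency sits at the ceiling for two alleles.
snp_calls = ["A"] * 5 + ["G"] * 5
# A microhaplotype with five equally frequent alleles across ten samples.
microhap_calls = ["ACG", "ACT", "GCG", "GTT", "ATG"] * 2

h_snp = expected_heterozygosity(snp_calls)            # approximately 0.5
h_microhap = expected_heterozygosity(microhap_calls)  # approximately 0.8
print(f"SNP H = {h_snp:.2f}, microhaplotype H = {h_microhap:.2f}")
```

Under this assumed definition, a marker with k equally frequent alleles has H = 1 - 1/k, so at least four alleles are needed to exceed 0.70.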