Lineage-informative microhaplotypes for spatio-temporal surveillance of malaria parasites.

Author information

Siegel Sasha V, Amato Roberto, Trimarsanto Hidayat, Sutanto Edwin, Kleinecke Mariana, Murie Kathryn, Whitton Georgia, Taylor Aimee R, Watson James A, Imwong Mallika, Assefa Ashenafi, Rahim Awab Ghulam, Chau Nguyen Hoang, Hien Tran Tinh, Green Justin A, Koh Gavin, White Nicholas J, Day Nicholas, Kwiatkowski Dominic P, Rayner Julian C, Price Ric N, Auburn Sarah

Affiliations

Wellcome Sanger Institute, Hinxton, Cambridge CB10 1SA, UK.

Menzies School of Health Research and Charles Darwin University, Darwin, Northern Territory 0811, Australia.

Publication information

medRxiv. 2023 Mar 16:2023.03.13.23287179. doi: 10.1101/2023.03.13.23287179.

Abstract

Challenges in understanding the origin of recurrent Plasmodium vivax infections constrain the surveillance of antimalarial efficacy and transmission of this neglected parasite. Recurrent infections within an individual may arise from activation of dormant liver stages (relapse), blood-stage treatment failure (recrudescence) or new inoculations (reinfection). Molecular inference of familial relatedness (identity-by-descent, or IBD) based on whole-genome sequence data, together with analysis of the intervals between parasitaemic episodes ("time-to-event" analysis), can help resolve the probable origin of recurrences. Whole-genome sequencing of predominantly low-density infections is challenging, so an accurate and scalable genotyping method to determine the origins of recurrent parasitaemia would be of significant benefit. We have developed a genome-wide informatics pipeline to select specific microhaplotype panels that can capture IBD within small, amplifiable segments of the genome. Using a global set of 615 genomes, we derived a panel of 100 microhaplotypes, each comprising 3-10 high-frequency SNPs within <200 bp sequence windows. This panel exhibits high diversity in regions of the Asia-Pacific, Latin America and the Horn of Africa (median = 0.70-0.81), and it captured 89% (273/307) of the polyclonal infections detected with genome-wide datasets. Using data simulations, we demonstrate lower error in estimating pairwise IBD using microhaplotypes, relative to traditional biallelic SNP barcodes. Our panel exhibited high accuracy in predicting the country of origin (median Matthews correlation coefficient >0.9 in 90% of countries tested), and it also captured local infection outbreak and bottlenecking events. The informatics pipeline is available open-source and yields microhaplotypes that can be readily transferred to high-throughput amplicon sequencing assays for surveillance in malaria-endemic regions.
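
To make the window criterion concrete, the following is a minimal Python sketch of how candidate microhaplotype windows (3-10 high-frequency SNPs spanning fewer than 200 bp) might be located on one chromosome. It is not the authors' pipeline: the function name `candidate_windows`, the greedy left-to-right scan, and the minor-allele-frequency cutoff `min_maf = 0.1` are all assumptions made for illustration, since the abstract does not define "high frequency" or describe the selection algorithm.

```python
from typing import List, Tuple


def candidate_windows(
    snps: List[Tuple[int, float]],
    max_span: int = 200,   # windows must be shorter than this many bp
    min_snps: int = 3,
    max_snps: int = 10,
    min_maf: float = 0.1,  # ASSUMED cutoff; "high frequency" is not defined in the abstract
) -> List[Tuple[int, int, int]]:
    """Greedy scan over (position, minor_allele_frequency) pairs on one chromosome.

    Returns non-overlapping windows as (start, end, n_snps) tuples.
    Hypothetical helper for illustration only.
    """
    # Keep only SNPs that pass the assumed frequency cutoff, sorted by position.
    positions = sorted(pos for pos, maf in snps if maf >= min_maf)
    windows = []
    i = 0
    while i < len(positions):
        j = i
        # Extend the window while it stays under max_span bp and max_snps SNPs.
        while (
            j + 1 < len(positions)
            and positions[j + 1] - positions[i] + 1 < max_span
            and (j + 1) - i + 1 <= max_snps
        ):
            j += 1
        n = j - i + 1
        if n >= min_snps:
            windows.append((positions[i], positions[j], n))
            i = j + 1  # jump past the window so windows never overlap
        else:
            i += 1
    return windows


# Toy input: (position, minor allele frequency); values are made up.
example = [(100, 0.30), (150, 0.25), (180, 0.05), (260, 0.40), (900, 0.50), (950, 0.20)]
print(candidate_windows(example))  # [(100, 260, 3)]
```

In the toy input, the SNP at position 180 is dropped by the assumed frequency cutoff, and the pair at 900 and 950 is rejected because it contains fewer than three SNPs. A real panel design would also need to rank windows by diversity and check that they can be amplified, which this sketch does not attempt.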


