Department of Biostatistics, School of Public Health, University of California, Berkeley, CA 94704, United States.
EPPIcenter Program, Division of HIV, ID and Global Medicine, University of California, San Francisco, CA 94110, United States.
Bioinformatics. 2024 Oct 1;40(10). doi: 10.1093/bioinformatics/btae619.
Malaria parasite genetic data can provide insight into parasite phenotypes, evolution, and transmission. However, estimating key parameters such as allele frequencies, multiplicity of infection (MOI), and within-host relatedness from genetic data is challenging, particularly in the presence of multiple related coinfecting strains. Existing methods often rely on single nucleotide polymorphism (SNP) data and do not account for within-host relatedness.
We present Multiplicity Of Infection and allele frequency REcovery (MOIRE), a Bayesian approach to estimate allele frequencies, MOI, and within-host relatedness from genetic data subject to experimental error. MOIRE accommodates both polyallelic and SNP data, making it applicable to diverse genotyping panels. We also introduce a novel metric, the effective MOI (eMOI), which integrates MOI and within-host relatedness into a single robust and interpretable measure of genetic diversity. Extensive simulations and real-world data from a malaria study in Namibia demonstrate that MOIRE outperforms naive estimation methods and accurately estimates MOI up to seven with moderate-sized panels of diverse loci (e.g. microhaplotypes). MOIRE also revealed substantial heterogeneity in population mean MOI and mean relatedness across health districts in Namibia, suggesting detectable differences in transmission dynamics. Notably, eMOI emerges as a portable metric of within-host diversity, facilitating meaningful comparisons across settings even when allele frequencies or genotyping panels differ. Compared with existing software, MOIRE enables more comprehensive insights into within-host diversity and population structure.
MOIRE is available as an R package at https://eppicenter.github.io/moire/.
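For context on the "naive estimation methods" used as the baseline above, the following is a minimal Python sketch of one common naive approach: take MOI as the largest number of distinct alleles observed at any locus in a sample, and take allele frequencies as the share of allele observations across samples. This is an illustrative assumption about what a naive estimator looks like, not MOIRE's Bayesian model and not part of the MOIRE R package; the function names and the toy data are hypothetical.

```python
from collections import Counter


def naive_moi(sample):
    """Naive MOI: the maximum number of distinct alleles seen at any locus.

    `sample` maps a locus name to the set of alleles observed in one infection
    (hypothetical data layout chosen for this sketch).
    """
    return max((len(alleles) for alleles in sample.values()), default=0)


def naive_allele_frequencies(samples, locus):
    """Naive frequencies: each allele's share of all allele observations at `locus`."""
    counts = Counter()
    for sample in samples:
        # Each allele is counted once per sample in which it is observed.
        counts.update(sample.get(locus, ()))
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()} if total else {}


# Toy data: three infections genotyped at two loci (invented for illustration).
samples = [
    {"L1": {"A", "B"}, "L2": {"x"}},
    {"L1": {"A"}, "L2": {"x", "y", "z"}},
    {"L1": {"B", "C"}, "L2": {"y"}},
]

moi_estimates = [naive_moi(s) for s in samples]
l1_frequencies = naive_allele_frequencies(samples, "L1")
```

Estimators of this kind ignore genotyping error and undercount strains that share alleles, which is more likely when coinfecting strains are related; those are the conditions the abstract identifies as challenging.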