Estimating multiplicity of infection, haplotype frequencies, and linkage disequilibria from multi-allelic markers for molecular disease surveillance.

Author information

Tsoungui Obama Henri Christian Junior, Schneider Kristan Alexander

Affiliations

Department of Applied Computer- and Biosciences, University of Applied Sciences Mittweida, Mittweida, Germany.

Department of Mathematics, Chemnitz University of Technology, Chemnitz, Germany.

Publication information

PLoS One. 2025 May 27;20(5):e0321723. doi: 10.1371/journal.pone.0321723. eCollection 2025.

Abstract

BACKGROUND

Molecular/genetic methods are becoming increasingly important for the surveillance of diseases like malaria. Such methods allow monitoring of disease transmission routes and of the origin and spread of variants associated with drug resistance. A confounding factor in molecular disease surveillance is the presence of multiple distinct variants in the same infection (multiplicity of infection, MOI), which leads to ambiguity when reconstructing which pathogenic variants are present in an infection. Heuristic approaches often ignore ambiguous infections, which leads to biased results.
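
To make the ambiguity concrete, the following minimal Python sketch enumerates every set of two-marker haplotypes that could explain one observed infection. It is an illustration only, not the authors' method: the function name `compatible_haplotype_sets` is hypothetical, and it assumes that every observed allele is carried by at least one haplotype in the infection and that no allele goes undetected.

```python
from itertools import combinations, product


def compatible_haplotype_sets(alleles_1, alleles_2):
    """Enumerate haplotype sets consistent with one observed infection.

    alleles_1, alleles_2: sets of alleles detected at marker 1 and marker 2.
    Assumption (not from the source): every detected allele is carried by at
    least one haplotype, and no allele present in the infection is missed.
    """
    need_1, need_2 = set(alleles_1), set(alleles_2)
    # Every haplotype in the infection must pair an observed allele at
    # marker 1 with an observed allele at marker 2.
    candidates = sorted(product(sorted(need_1), sorted(need_2)))
    consistent = []
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            covers_1 = {h[0] for h in subset} == need_1
            covers_2 = {h[1] for h in subset} == need_2
            if covers_1 and covers_2:
                consistent.append(subset)
    return consistent


# One allele at marker 1, two at marker 2: only one explanation exists.
unambiguous = compatible_haplotype_sets({"A"}, {"x", "y"})
# Two alleles at each marker: several explanations are possible.
ambiguous = compatible_haplotype_sets({"A", "B"}, {"x", "y"})
print(len(unambiguous), len(ambiguous))
```

An observation is ambiguous in this sense only when more than one allele is detected at both markers; a heuristic that discards such infections loses exactly those samples.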

METHODS

To avoid bias, we introduce a statistical framework to estimate haplotype frequencies alongside MOI from a pair of multi-allelic molecular markers. Estimates are based on maximum likelihood using the expectation-maximization (EM) algorithm. The estimates can be used as plug-ins to construct pairwise linkage disequilibrium (LD) maps. The finite-sample properties of the proposed method are studied by systematic numerical simulations. These reveal that the EM algorithm is numerically stable in our case and that the proposed method is accurate (little bias) and precise (small variance) for a reasonable sample size. In fact, the results suggest that the estimator is asymptotically unbiased. Furthermore, the method is appropriate to estimate LD (by [Formula: see text], [Formula: see text], [Formula: see text], or conditional asymmetric LD). As an illustration, we apply the new method to a previously published dataset from Cameroon concerning sulfadoxine-pyrimethamine (SP) resistance. The results are in accordance with the SP drug pressure at the time and the observed spread of resistance in the country, yielding further evidence for the adequacy of the proposed method.
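
The plug-in step can be sketched as follows: given a table of haplotype frequencies for two multi-allelic markers, compute the pairwise disequilibrium coefficients and one summary LD measure. This is a hedged illustration, not the authors' R implementation. The function name `pairwise_ld` is hypothetical, the summary shown is the commonly used multi-allelic D' (a weighted average of normalized coefficients), and whether it matches the measures elided as "[Formula: see text]" above is an assumption.

```python
def pairwise_ld(hap_freq):
    """Plug-in LD from a two-marker haplotype frequency table.

    hap_freq[i][j]: (estimated) frequency of the haplotype carrying allele i
    at marker 1 and allele j at marker 2. The table is renormalized to sum
    to 1. Returns the matrix D[i][j] = P[i][j] - p[i] * q[j] and the
    multi-allelic D' (assumed here as the summary measure).
    """
    total = sum(sum(row) for row in hap_freq)
    P = [[x / total for x in row] for row in hap_freq]
    p = [sum(row) for row in P]        # allele frequencies at marker 1
    q = [sum(col) for col in zip(*P)]  # allele frequencies at marker 2

    D = [[P[i][j] - p[i] * q[j] for j in range(len(q))] for i in range(len(p))]

    d_prime = 0.0
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            d = D[i][j]
            # Largest magnitude D[i][j] can take given the allele frequencies.
            if d < 0:
                d_max = min(pi * qj, (1 - pi) * (1 - qj))
            else:
                d_max = min(pi * (1 - qj), (1 - pi) * qj)
            if d_max > 0:
                d_prime += pi * qj * abs(d) / d_max
    return D, d_prime


# Hypothetical frequency tables, for illustration only.
_, d_equilibrium = pairwise_ld([[0.15, 0.35], [0.15, 0.35]])  # independent markers
_, d_complete = pairwise_ld([[0.5, 0.0], [0.0, 0.5]])         # perfectly associated
print(d_equilibrium, d_complete)
```

Because the frequencies passed in are estimated from all samples, including ambiguous ones, the LD values inherit the full sample size rather than that of a filtered subset.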

CONCLUSION

The proposed method can be readily applied in practice for malaria disease surveillance as a replacement for heuristic methods. The first benefit is its ability to estimate MOI, which scales with transmission intensities, and, in a temporal context, can be used to evaluate the effectiveness of disease control measures. MOI is best estimated from molecular markers that are not under selection (neutral markers) and exhibit sufficient genetic variation. The second advantage is that it can estimate pairwise LD without deflating sample size as in heuristic methods, thereby limiting uncertainty in the estimates. This is particularly useful when deriving LD maps from data with many ambiguous observations due to MOI. Importantly, the method per se is not restricted to malaria, but applicable to any disease with a similar transmission pattern. The method and several extensions are implemented in an easy-to-use R script.
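
The sample-size deflation caused by heuristic filtering can be illustrated with a short sketch. The data below are hypothetical toy observations, and `split_by_ambiguity` is an illustrative helper rather than part of the authors' R script; it treats an infection as ambiguous when more than one allele is detected at both markers.

```python
def split_by_ambiguity(samples):
    """Split two-marker observations into unambiguous and ambiguous ones.

    samples: list of (alleles_1, alleles_2) pairs, each a set of detected
    alleles. An observation is treated as ambiguous when both markers show
    more than one allele, since the haplotypes present cannot then be read
    off directly.
    """
    unambiguous, ambiguous = [], []
    for alleles_1, alleles_2 in samples:
        if len(alleles_1) > 1 and len(alleles_2) > 1:
            ambiguous.append((alleles_1, alleles_2))
        else:
            unambiguous.append((alleles_1, alleles_2))
    return unambiguous, ambiguous


# Hypothetical toy data, for illustration only.
samples = [
    ({"A"}, {"x"}),
    ({"A", "B"}, {"x"}),
    ({"A", "B"}, {"x", "y"}),
    ({"B"}, {"x", "y"}),
    ({"A", "B", "C"}, {"y", "z"}),
]
kept, dropped = split_by_ambiguity(samples)
retained_fraction = len(kept) / len(samples)
print(len(kept), len(dropped), retained_fraction)
```

A heuristic analysis would proceed with only the retained observations, whereas a likelihood-based approach uses all of them.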
