Rancu Isabel, Sobkowiak Benjamin, Warren Joshua L, Ciobanu Nelly, Codreanu Alexandru, Crudu Valeriu, Colijn Caroline, Cohen Ted, Chitwood Melanie H
Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, Connecticut, USA.
Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, USA.
Access Microbiol. 2025 May 12;7(5). doi: 10.1099/acmi.0.000964.v3. eCollection 2025.
Over the past three decades, molecular epidemiological studies have provided new opportunities to investigate the transmission dynamics of Mycobacterium tuberculosis. In most studies, a sizable fraction of individuals with notified tuberculosis cannot be included, either because they do not have culture-positive disease (and thus have no specimens available for molecular typing) or because resources for sequencing are limited. A recent study introduced a regression-based approach for inferring the membership of unsequenced tuberculosis cases in transmission clusters from host demographic and epidemiological data. That method identified the most likely cluster to which an unsequenced strain belonged with an accuracy of 35%, although in a low-burden setting where a large fraction of cases occurred among foreign-born migrants. Here, we apply a similar model to whole-genome sequencing data from the Republic of Moldova, a setting of relatively high local transmission. Our best performance came with a maximum cluster span of 40 single nucleotide polymorphisms (SNPs) and a cluster size cutoff of ≥10: under these settings, we predicted the specific cluster to which each clustered case most likely belonged with an accuracy of 17.2%. In sensitivity analyses, neither a more restrictive (20 SNPs) nor a more permissive (~80 SNPs) threshold improved performance, whereas increasing the minimum cluster size improved prediction accuracy. These findings highlight the challenges of transmission inference in high-burden settings like Moldova.