A new method for detecting mixed Mycobacterium tuberculosis infection and reconstructing constituent strains provides insights into transmission.

Author information

Sobkowiak Benjamin, Cudahy Patrick, Chitwood Melanie H, Clark Taane G, Colijn Caroline, Grandjean Louis, Walter Katharine S, Crudu Valeriu, Cohen Ted

Affiliations

Department of Epidemiology of Microbial Disease, Yale School of Public Health, 60 College Street, New Haven, CT, USA.

Department of Infection, Immunity and Inflammation, Institute of Child Health, University College London, London, UK.

Publication information

Genome Med. 2025 Jan 27;17(1):8. doi: 10.1186/s13073-025-01430-y.

Abstract

BACKGROUND

Mixed infection with multiple strains of the same pathogen in a single host can present clinical and analytical challenges. Whole genome sequence (WGS) data can identify signals of multiple strains in samples, though the precision of previous methods can be improved. Here, we present MixInfect2, a new tool to accurately detect mixed samples from Mycobacterium tuberculosis short-read WGS data. We then evaluate three approaches for reconstructing the underlying mixed constituent strain sequences. This allows these samples to be included in downstream analysis to gain insights into the epidemiology and transmission of mixed infections.

METHODS

We employed a Gaussian mixture model to cluster allele frequencies at mixed sites (hSNPs) in each sample to identify signals of multiple strains. Building upon our previous tool, MixInfect, we increased the accuracy of classifying in vitro mixed samples through multiple improvements to the bioinformatic pipeline. Major and minor proportion constituent strains were reconstructed using three approaches and assessed by comparing the estimated sequence to the known constituent strain sequence. Lastly, mixed infections in a real-world Mycobacterium tuberculosis population from Moldova were detected with MixInfect2 and clusters of recent transmission that included major and minor constituent strains were built.
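The clustering step described above can be illustrated with a small sketch. It is not the MixInfect2 implementation: the function names (`fit_gmm_1d`, `estimate_major_proportion`), the fixed choice of two components, the starting values, and the example allele frequencies are all assumptions made for illustration. The idea is that, in a two-strain mixture, allele frequencies at hSNPs tend to cluster around the major and minor strain proportions, so the higher cluster mean gives an estimate of the major strain proportion. The decision of whether a sample is mixed at all is not shown.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(values, k=2, iters=100, min_var=1e-4):
    """Fit a k-component 1D Gaussian mixture by EM (hypothetical helper)."""
    n = len(values)
    ordered = sorted(values)
    # Assumption: start the means at evenly spaced quantiles of the data.
    means = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    variances = [0.01] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each allele frequency.
        resp = []
        for x in values:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            resp.append([d / total for d in dens])
        # M-step: update weights, means, and variances from responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, values)) / nj
            variances[j] = max(var, min_var)  # floor keeps the density finite
    return weights, means, variances

def estimate_major_proportion(hsnp_freqs):
    """Take the higher cluster mean as the major strain proportion."""
    _, means, _ = fit_gmm_1d(hsnp_freqs, k=2)
    return max(means)

# Made-up allele frequencies at hSNPs for a roughly 70/30 mixture.
example_freqs = [0.68, 0.70, 0.72, 0.71, 0.69, 0.30, 0.32, 0.28, 0.29, 0.31]
major = estimate_major_proportion(example_freqs)
print(f"estimated major strain proportion: {major:.2f}")
```

In practice a library mixture model with model selection over the number of components would likely be preferable to this hand-written EM loop; the plain-Python version is used here only to keep the sketch self-contained.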

RESULTS

MixInfect2 correctly classified all 36 in vitro mixed samples and all 12 non-mixed samples, and estimated major strain proportions with high accuracy (within 3% of the true strain proportion), outperforming previous tools. Taking the highest-frequency allele at each hSNP produced reconstructed major strain sequences that closely matched the true constituent sequences. The best-performing approach for reconstructing the minor proportion strain sequence was identifying the closest non-mixed isolate in the same population, though no approach was effective when the minor strain proportion was 5%. Finally, fewer mixed infections were identified in Moldova than previously estimated (6.6% vs 17.4%), and we found multiple instances where the constituent strains of mixed samples were present in transmission clusters.
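The two reconstruction strategies reported above can be sketched as follows. This is one plausible reading under stated assumptions, not the authors' pipeline: the data layout (per-site read counts keyed by position), the function names, the use of minor-frequency alleles as a provisional profile, and the mismatch-count distance are all hypothetical choices made for illustration.

```python
def reconstruct_major(hsnp_counts):
    """Major strain alleles: the highest-count allele at each hSNP.

    hsnp_counts maps position -> {base: read count}. Ties resolve to the
    first allele listed, which is an arbitrary choice in this sketch.
    """
    return {pos: max(counts, key=counts.get)
            for pos, counts in hsnp_counts.items()}

def provisional_minor(hsnp_counts):
    """Provisional minor strain alleles: the lowest-count allele at each hSNP."""
    return {pos: min(counts, key=counts.get)
            for pos, counts in hsnp_counts.items()}

def closest_isolate(profile, isolates):
    """Name of the non-mixed isolate with the fewest mismatches to the profile.

    isolates maps isolate name -> {position: base}. Assumption: distance is
    the number of hSNP positions where the isolate differs from the profile.
    """
    def mismatches(name):
        seq = isolates[name]
        return sum(1 for pos, base in profile.items() if seq.get(pos) != base)
    return min(isolates, key=mismatches)

# Made-up read counts at three hSNPs in one mixed sample.
hsnp_counts = {
    101: {"A": 70, "G": 30},
    250: {"C": 28, "T": 72},
    903: {"G": 69, "A": 31},
}
# Made-up non-mixed isolates from the same population.
isolates = {
    "iso1": {101: "G", 250: "C", 903: "A"},
    "iso2": {101: "A", 250: "C", 903: "G"},
    "iso3": {101: "A", 250: "T", 903: "G"},
}

major_seq = reconstruct_major(hsnp_counts)
minor_match = closest_isolate(provisional_minor(hsnp_counts), isolates)
print(major_seq, minor_match)
```

Using a real isolate as a stand-in for the minor strain avoids relying on low-frequency allele calls alone, which is consistent with the abstract's observation that no approach was effective at a 5% minor strain proportion.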

CONCLUSIONS

MixInfect2 accurately detects samples with evidence of mixed infection from short-read WGS data and provides an excellent estimate of the mixture proportions. While there are limitations in reconstructing the constituent strain sequences of mixed samples, we present recommendations for the best approach to include these isolates in further analyses.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3317/11771024/8f1063f4f2c3/13073_2025_1430_Fig1_HTML.jpg
