Kim Younhun, Worby Colin J, Acharya Sawal, van Dijk Lucas R, Alfonsetti Daniel, Gromko Zackary, Azimzadeh Philippe N, Dodson Karen W, Gerber Georg K, Hultgren Scott J, Earl Ashlee M, Berger Bonnie, Gibson Travis E
Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA, USA.
Department of Pathology, Brigham and Women's Hospital, Boston, MA, USA.
Nat Microbiol. 2025 May;10(5):1184-1197. doi: 10.1038/s41564-025-01983-z. Epub 2025 May 6.
The ability to detect and quantify microbiota over time from shotgun metagenomic data has a plethora of clinical, basic science and public health applications. Given these applications, and the observation that pathogens and other taxa of interest can reside at low relative abundance, there is a critical need for algorithms that accurately profile low-abundance microbial taxa with strain-level resolution. Here we present ChronoStrain: a sequence quality- and time-aware Bayesian model for profiling strains in longitudinal samples. ChronoStrain explicitly models the presence or absence of each strain and produces a probability distribution over abundance trajectories for each strain. Using synthetic and semi-synthetic data, we demonstrate how ChronoStrain outperforms existing methods in abundance estimation and presence/absence prediction. Applying ChronoStrain to two human microbiome datasets demonstrated its improved interpretability for profiling Escherichia coli strain blooms in longitudinal faecal samples from adult women with recurring urinary tract infections, and its improved accuracy for detecting Enterococcus faecalis strains in infant faecal samples. Compared with state-of-the-art methods, ChronoStrain's ability to detect low-abundance taxa is particularly stark.
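The two modelling ideas named in the abstract, an explicit presence/absence indicator per strain and a distribution over abundance trajectories across timepoints, can be illustrated with a toy generative sketch. Everything below is an assumption made for illustration: the function name `simulate_trajectories`, the Bernoulli presence prior, the Gaussian random walk whose variance grows with the time gap, and the softmax step are not taken from the paper and are not ChronoStrain's actual model or API. The sequence-quality component is omitted entirely.

```python
import math
import random


def simulate_trajectories(n_strains, timepoints, p_present=0.5, step_sd=1.0, seed=0):
    """Draw one sample from a toy time-aware presence/abundance model.

    Hypothetical illustration only; not the model used by ChronoStrain.
    """
    rng = random.Random(seed)

    # Assumed Bernoulli prior: each strain is independently present or absent.
    present = [rng.random() < p_present for _ in range(n_strains)]

    # Assumed latent log-abundances: a Gaussian random walk per strain.
    # The step standard deviation scales with sqrt(time gap), so samples
    # taken far apart in time are allowed to differ more.
    latent = [[rng.gauss(0.0, 1.0)] for _ in range(n_strains)]
    for prev_t, t in zip(timepoints, timepoints[1:]):
        sd = step_sd * math.sqrt(t - prev_t)
        for traj in latent:
            traj.append(traj[-1] + rng.gauss(0.0, sd))

    # Softmax over the strains marked present; absent strains get abundance 0.
    abundances = []
    for k in range(len(timepoints)):
        weights = [
            math.exp(latent[s][k]) if present[s] else 0.0
            for s in range(n_strains)
        ]
        total = sum(weights)
        abundances.append([w / total if total > 0 else 0.0 for w in weights])

    return present, abundances
```

Calling `simulate_trajectories(4, [0.0, 1.0, 3.0, 7.0])` returns one presence vector and one relative-abundance row per timepoint. Repeating the draw with different seeds gives a crude picture of a distribution over trajectories, whereas a real Bayesian method would infer a posterior from sequencing reads.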