Campbell Amy Marie, Hauton Chris, van Aerle Ronny, Martinez-Urtaza Jaime
School of Ocean and Earth Science, University of Southampton, Southampton, United Kingdom.
Centre for Environment, Fisheries and Aquaculture Science (CEFAS), Weymouth, United Kingdom.
JMIR Bioinform Biotechnol. 2024 Nov 28;5:e62747. doi: 10.2196/62747.
Environmentally sensitive pathogens exhibit ecological and evolutionary responses to climate change that result in the emergence and global expansion of well-adapted variants. Understanding the mechanisms that facilitate pathogen emergence and expansion, and the drivers behind those mechanisms, is essential for anticipating and preparing for future pandemic expansions.
The unique, rapid, global expansion of a clonal complex of Vibrio parahaemolyticus (a marine bacterium causing gastroenteritis infections) named Vibrio parahaemolyticus sequence type 3 (VpST3) provides an opportunity to explore the eco-evolutionary drivers of pathogen expansion.
The global expansion of VpST3 was reconstructed using VpST3 genomes, and each genome was then classified using metrics that characterize the stages of this expansion process, from emergence to establishment. We used machine learning, specifically a random forest classifier, to test a range of ecological and evolutionary drivers for their potential to predict VpST3 expansion dynamics.
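As a rough illustration of this kind of comparison, the sketch below trains a random forest on evolutionary features only, ecological features only, and both combined, then reports cross-validated accuracy for each. It is not the authors' pipeline: the data are synthetic, and the feature matrices `evo` and `eco`, the label `y`, and the helper `cv_accuracy` are hypothetical names introduced here. It assumes scikit-learn and NumPy are installed.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_genomes = 300

# Hypothetical evolutionary features: presence/absence of core-genome
# mutations or accessory genes, one binary column per feature.
evo = rng.integers(0, 2, size=(n_genomes, 20))
# Hypothetical ecological features: standardized environmental variables.
eco = rng.normal(size=(n_genomes, 5))

# Synthetic binary expansion class that depends on both feature sets.
signal = evo[:, 0] + evo[:, 1] + eco[:, 0]
y = (signal + rng.normal(scale=0.5, size=n_genomes) > 1.0).astype(int)


def cv_accuracy(X, y):
    """Mean 5-fold cross-validated accuracy of a random forest on X."""
    clf = RandomForestClassifier(
        n_estimators=200, class_weight="balanced", random_state=0
    )
    return cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean()


scores = {
    "evolutionary": cv_accuracy(evo, y),
    "ecological": cv_accuracy(eco, y),
    "combined": cv_accuracy(np.hstack([evo, eco]), y),
}

for name, acc in scores.items():
    print(f"{name:>12}: {acc:.3f}")
```

Because the synthetic label is built from both feature sets, the combined model would typically score highest here. That outcome is a property of the toy data and not evidence about VpST3.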
We identified a range of evolutionary features, including mutations in the core genome and accessory gene presence, associated with expansion dynamics. A range of random forest classifier approaches were tested to predict expansion classification metrics for each genome. The highest predictive accuracies (ranging from 0.722 to 0.967) were achieved for models using a combined eco-evolutionary approach. While population structure and the difference between introduced and established isolates could be predicted with high accuracy, our model reported multiple false positives when predicting the success of an introduced isolate, suggesting potential limiting factors not represented in our eco-evolutionary features. Regional models produced for the 2 countries reporting the most VpST3 genomes had varying success, reflecting the impacts of class imbalance.
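To see why accuracy alone can hide false positives under class imbalance, consider the minimal sketch below. The counts are invented for illustration and are not taken from the study: 100 hypothetical introduced isolates, of which 10 go on to succeed, and a model that flags 18 as successful. The functions `confusion_counts` and `summarize` are hypothetical helpers written for this example.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def summarize(y_true, y_pred):
    """Compute accuracy alongside metrics that are sensitive to imbalance."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": recall,
        "balanced_accuracy": (recall + specificity) / 2,
    }


# Invented example: 10 successful (1) and 90 unsuccessful (0) introductions.
y_true = [1] * 10 + [0] * 90
# The model finds 8 of the 10 successes but also flags 10 failures as successes.
y_pred = [1] * 8 + [0] * 2 + [1] * 10 + [0] * 80

metrics = summarize(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this toy case accuracy is 0.88, yet precision is only 8/18 (about 0.44), because more than half of the predicted successes are false positives. Reporting precision, recall, or balanced accuracy next to accuracy makes that pattern visible.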
These novel insights into evolutionary features and ecological conditions related to the stages of VpST3 expansion showcase the potential of machine learning models using genomic data and will contribute to the future understanding of the eco-evolutionary pathways of climate-sensitive pathogens.