Department of Civil and Environmental Engineering, University of Nebraska-Lincoln, 900 N. 16th St, W150D Nebraska Hall, Lincoln, NE 68588-0531, United States.
Department of Statistics, University of Nebraska-Lincoln, Lincoln, NE 68583, United States.
Water Res. 2021 Sep 1;202:117384. doi: 10.1016/j.watres.2021.117384. Epub 2021 Jun 26.
While the microbiome of activated sludge (AS) in wastewater treatment plants (WWTPs) plays a vital role in shaping the resistome, identifying the potential bacterial hosts of antibiotic resistance genes (ARGs) in WWTPs remains challenging. The objective of this study is to explore the feasibility of using a machine learning approach, random forests (RFs), to identify the strength of associations between ARGs and bacterial taxa in metagenomic datasets from the activated sludge of WWTPs. Our results show that the abundance of select ARGs can be predicted by RFs using abundant genera (e.g., Candidatus Accumulibacter, Dechloromonas, Pseudomonas, and Thauera), (opportunistic) pathogens and indicators (e.g., Bacteroides, Clostridium, and Streptococcus), and nitrifiers (e.g., Nitrosomonas and Nitrospira) as explanatory variables. The correlations between the predicted and observed abundance of ARGs (e.g., erm(B), tet(O), and tet(Q)) ranged from medium (0.400 < R < 0.600) to strong (R > 0.600) when validated on testing datasets. Compared to genera in the other two groups, individual genera in the group of (opportunistic) pathogens and indicator bacteria had more positive functional relationships with select ARGs, suggesting that genera in this group (e.g., Bacteroides, Clostridium, and Streptococcus) may be hosts of select ARGs. Furthermore, RFs with (opportunistic) pathogens and indicators as explanatory variables successfully predicted the abundance of select ARGs in a full-scale WWTP. Machine learning approaches such as RFs can potentially identify bacterial hosts of ARGs and reveal possible functional relationships between ARGs and the microbial community in the AS of WWTPs.
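The workflow summarized above (fit a random forest on genus abundances, predict ARG abundance on a held-out test set, then grade the correlation R) can be illustrated with a minimal sketch. This is not the authors' implementation: the forest is a small hand-rolled one in pure Python, the data are synthetic, and every function name, the train/test split, the tree count, the tree depth, and the one-third feature fraction are assumptions made for illustration. The "weak" label, and the treatment of R exactly equal to 0.600 as medium, are also assumptions, since the abstract defines only the medium and strong bands.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / sqrt(sxx * syy)

def strength(r):
    """Grade R with the abstract's bands; 'weak' is an assumed extra label."""
    if r > 0.600:
        return "strong"
    if r > 0.400:
        return "medium"
    return "weak"

def sse(values):
    """Sum of squared deviations from the mean (split-quality criterion)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(X, y, rng, depth=0, max_depth=4, min_leaf=3):
    """Grow one regression tree; a leaf is a float, a split is a tuple."""
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return sum(y) / len(y)
    n_feat = len(X[0])
    best = None
    # Consider only a random third of the features at each split (assumed).
    for j in rng.sample(range(n_feat), max(1, n_feat // 3)):
        vals = sorted({row[j] for row in X})
        for lo, hi in zip(vals, vals[1:]):
            t = (lo + hi) / 2
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, j, t)
    if best is None:
        return sum(y) / len(y)
    _, j, t = best
    li = [i for i, row in enumerate(X) if row[j] <= t]
    ri = [i for i, row in enumerate(X) if row[j] > t]
    return (j, t,
            build_tree([X[i] for i in li], [y[i] for i in li], rng, depth + 1, max_depth, min_leaf),
            build_tree([X[i] for i in ri], [y[i] for i in ri], rng, depth + 1, max_depth, min_leaf))

def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf's mean value."""
    while isinstance(node, tuple):
        j, t, left, right = node
        node = left if row[j] <= t else right
    return node

def fit_forest(X, y, n_trees=20, seed=0):
    """Fit each tree on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = rng.choices(range(len(X)), k=len(X))
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx], rng))
    return forest

def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return sum(predict_tree(tree, row) for tree in forest) / len(forest)

# Synthetic stand-in data (NOT from the study): relative abundances of three
# genera and an ARG abundance that, by construction, depends on two of them.
genera = ["Bacteroides", "Clostridium", "Streptococcus"]
data_rng = random.Random(42)
X = [[data_rng.random() for _ in genera] for _ in range(90)]
y = [0.6 * row[0] + 0.3 * row[1] + data_rng.gauss(0, 0.05) for row in X]

X_train, y_train, X_test, y_test = X[:60], y[:60], X[60:], y[60:]
forest = fit_forest(X_train, y_train)
predicted = [predict_forest(forest, row) for row in X_test]
r_test = pearson_r(predicted, y_test)
print(f"test R = {r_test:.3f} ({strength(r_test)})")
```

In practice an established library implementation of random forests would be preferable to this hand-rolled version; the sketch only makes the train, predict, and correlate steps concrete.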