Wang Shuyi, Zhao Chunjiang, Yin Yuyao, Chen Fengning, Chen Hongbin, Wang Hui
Institute of Medical Technology, Peking University Health Science Center, Beijing, China.
Department of Clinical Laboratory, Peking University People's Hospital, Beijing, China.
Front Microbiol. 2022 Mar 2;13:841289. doi: 10.3389/fmicb.2022.841289. eCollection 2022.
As sequencing becomes cheaper and faster, directly linking bacterial genotype to phenotype is increasingly important. Here, we first predicted the minimum inhibitory concentrations (MICs) of ten antimicrobial agents for Staphylococcus aureus using 466 isolates, extracting k-mers directly from whole-genome sequencing data and combining them with three machine learning algorithms: random forest, support vector machine, and XGBoost. Allowing for a difference of one two-fold dilution, the essential agreement and the category agreement could reach >85% and >90%, respectively, for most antimicrobial agents. For clindamycin, cefoxitin, and trimethoprim-sulfamethoxazole, the essential agreement and the category agreement could reach >91% and >93%, respectively, providing important information for clinical treatment. The successful prediction of cefoxitin resistance showed that the model could identify methicillin-resistant Staphylococcus aureus. The results suggest that small datasets of the size available in large hospitals can accurately predict bacterial phenotype without relying on existing basic research or known antimicrobial resistance genes.
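The two ideas the abstract relies on, k-mer extraction and agreement scoring, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the value of k, and the single susceptible/resistant breakpoint are all assumptions made for illustration, and real breakpoint tables may also define an intermediate category.

```python
from collections import Counter
from math import log2


def kmer_counts(sequence: str, k: int) -> Counter:
    """Count every overlapping k-mer in a DNA sequence (hypothetical helper)."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def essential_agreement(true_mics, predicted_mics, max_dilutions=1):
    """Fraction of predicted MICs within `max_dilutions` two-fold
    dilutions of the measured MIC (compared on a log2 scale)."""
    hits = sum(
        abs(log2(p) - log2(t)) <= max_dilutions + 1e-9
        for t, p in zip(true_mics, predicted_mics)
    )
    return hits / len(true_mics)


def category_agreement(true_mics, predicted_mics, breakpoint):
    """Fraction of predictions in the same category as the measurement.
    Assumption: an isolate is susceptible when MIC <= breakpoint and
    resistant otherwise; the breakpoint value is supplied by the caller."""
    hits = sum(
        (t <= breakpoint) == (p <= breakpoint)
        for t, p in zip(true_mics, predicted_mics)
    )
    return hits / len(true_mics)


# Illustrative values only, not data from the study.
measured = [0.5, 2, 8, 4]
predicted = [1, 2, 2, 4]
features = kmer_counts("ACGTACGT", k=4)
ea = essential_agreement(measured, predicted)
ca = category_agreement(measured, predicted, breakpoint=4)
```

In a workflow of this kind, the k-mer counts of each genome would form one row of a feature matrix, and a model such as a random forest would be trained to predict the MIC for each antimicrobial agent before the two agreement scores are computed on held-out isolates.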