Optimising machine learning prediction of minimum inhibitory concentrations in .

Affiliations

Department of Biology and Biotechnology, University of Pavia, Pavia, Italy.

MRC Centre for Global Infectious Disease Analysis, Imperial College, London, England, UK.

Publication information

Microb Genom. 2024 Mar;10(3). doi: 10.1099/mgen.0.001222.

Abstract

Minimum inhibitory concentrations (MICs) are the gold standard for quantitatively measuring antibiotic resistance. However, lab-based MIC determination can be time-consuming and suffers from low reproducibility, and interpretation as sensitive or resistant relies on guidelines that change over time. Genome sequencing and machine learning promise to allow MIC prediction as an alternative approach that overcomes some of these difficulties, although the predicted MIC must still be interpreted. Nevertheless, precisely how MIC data should be handled in predictive models remains unclear, since MICs are measured semi-quantitatively, with varying resolution, and are typically also left- and right-censored within varying ranges. We therefore investigated genome-based prediction of MICs in the pathogen using 4367 genomes with both simulated semi-quantitative traits and real MICs. As we were focused on clinical interpretation, we used interpretable rather than black-box machine learning models, namely Elastic Net, Random Forests, and linear mixed models. Simulated traits were generated accounting for oligogenic, polygenic, and homoplastic genetic effects with different levels of heritability. We then assessed how prediction accuracy changed when MIC prediction was framed as a regression or as a classification problem. Our results showed that treating the MICs differently depending on the number of antibiotic concentration levels available was the most promising learning strategy. Specifically, to optimise both prediction accuracy and inference of the correct causal variants, we recommend treating the MICs as continuous and framing the learning problem as a regression when the number of observed antibiotic concentration levels is large, whereas with a smaller number of concentration levels the MICs should be treated as a categorical variable and the learning problem framed as a classification.
Our findings also underline how predictive models can be improved when prior biological knowledge is taken into account, because the genetic architecture of each antibiotic resistance trait differs. Finally, we emphasise that expanding the population database is pivotal for the future clinical implementation of these models to support routine machine-learning-based diagnostics.
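The recommendation to choose between regression and classification according to the number of observed concentration levels can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `parse_mic` and `choose_framing`, the MIC label format (for example `<=0.25` or `>32`), the choice to count censored and uncensored readings as separate levels, and the cut-off `min_levels_for_regression=6` are all assumptions made for illustration, since the abstract does not state a specific threshold.

```python
import math


def parse_mic(label):
    """Parse an MIC label such as '<=0.25', '>32' or '4'.

    Returns (log2 of the concentration, censoring flag), where the flag is
    'left' for '<' or '<=', 'right' for '>' or '>=', and 'none' otherwise.
    The label format is an assumption, not taken from the paper.
    """
    text = label.strip()
    censor = "none"
    if text.startswith("<"):
        censor = "left"
    elif text.startswith(">"):
        censor = "right"
    value = float(text.lstrip("<>=").strip())
    return math.log2(value), censor


def choose_framing(labels, min_levels_for_regression=6):
    """Pick 'regression' or 'classification' from the number of distinct levels.

    A censored and an uncensored reading at the same concentration are
    counted as two levels. The default cut-off is hypothetical.
    """
    levels = {parse_mic(label) for label in labels}
    if len(levels) >= min_levels_for_regression:
        return "regression"
    return "classification"


# Example: a wide dilution series versus a narrow one.
wide = ["<=0.25", "0.5", "1", "2", "4", "8", "16", ">32"]
narrow = ["<=2", "4", ">4"]
print(choose_framing(wide), choose_framing(narrow))
```

In practice the cut-off would need to be tuned per antibiotic, and the censoring flag could be used to handle left- and right-censored values separately rather than only to count levels.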

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5fe9/10995625/e864e3d69eb7/mgen-10-01222-g001.jpg
