Predicting rifampicin resistance in Mycobacterium tuberculosis using machine learning informed by protein structural and chemical features.

Author information

Lynch Charlotte I, Adlard Dylan, Fowler Philip W

Affiliations

Nuffield Department of Medicine, University of Oxford, Oxford, UK.

These authors contributed equally.

Publication information

ERJ Open Res. 2025 Jun 30;11(3). doi: 10.1183/23120541.00952-2024. eCollection 2025 May.

Abstract

BACKGROUND

Rifampicin remains a key antibiotic in the treatment of tuberculosis. Despite advances in cataloguing resistance-associated variants (RAVs), novel and rare mutations in the relevant gene, rpoB, will be encountered in clinical samples, complicating the task of using genetics to predict whether a sample is resistant to rifampicin. We have trained a series of machine learning models with the aim of complementing genetics-based drug susceptibility testing.

METHODS

We built a Test+Train dataset comprising 219 susceptible mutations and 46 RAVs. Features derived from the structure of the RNA polymerase or the change in chemistry introduced by the mutation were considered; however, only a few, notably the distance from the rifampicin binding site, were found to be predictive on their own. Owing to the paucity of RAVs we used Monte Carlo cross-validation with 50 repeats to train four different machine learning models.
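
Monte Carlo cross-validation, as named above, repeatedly draws a fresh random train/test partition rather than cycling through fixed folds. The following is only a minimal sketch of that idea, not the authors' pipeline: the function name `monte_carlo_splits`, the 20% test fraction, the stratification by class and the seed are all assumptions for illustration, since the abstract does not state them. Only the class sizes (219 and 46) and the 50 repeats come from the text.

```python
import random
from typing import Dict, Iterator, List, Sequence, Tuple


def monte_carlo_splits(
    labels: Sequence[int],
    n_repeats: int = 50,          # number of repeats stated in the abstract
    test_fraction: float = 0.2,   # ASSUMPTION: the split ratio is not given
    seed: int = 0,                # ASSUMPTION: fixed only for reproducibility
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) from repeated stratified random splits."""
    rng = random.Random(seed)
    # Group sample indices by class so each split keeps the class ratio,
    # which matters when one class (the RAVs) is small.
    by_class: Dict[int, List[int]] = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    for _ in range(n_repeats):
        train: List[int] = []
        test: List[int] = []
        for idx in by_class.values():
            shuffled = idx[:]
            rng.shuffle(shuffled)
            n_test = max(1, round(len(shuffled) * test_fraction))
            test.extend(shuffled[:n_test])
            train.extend(shuffled[n_test:])
        yield sorted(train), sorted(test)


# Class sizes from the abstract: 219 susceptible mutations (0), 46 RAVs (1).
labels = [0] * 219 + [1] * 46
splits = list(monte_carlo_splits(labels))
```

Each of the 50 splits would then be used to fit a model on the training indices and score it on the held-out indices, with the scores averaged over repeats. Unlike k-fold cross-validation, a given mutation may appear in the test set of several repeats or of none.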

RESULTS

All four models performed similarly, with sensitivities in the range 0.84-0.88 and specificities in the range 0.94-0.97; we preferred the ensemble of decision tree models because it is easy to inspect and understand. We also showed that measuring distances from molecular dynamics simulations did not improve performance.
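
For readers less familiar with the two metrics quoted above, the sketch below shows how sensitivity and specificity are computed from predicted and true labels. It assumes the usual convention that resistant (RAV) is the positive class, coded 1, and susceptible is the negative class, coded 0. The function name and the toy labels are illustrative only and are not data from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) with 1 = resistant, 0 = susceptible."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # resistant, called resistant
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # resistant, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # susceptible, called susceptible
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # susceptible, called resistant
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


# Toy labels for illustration only (not taken from the paper).
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(toy_true, toy_pred)
```

With few RAVs in a test split, a single misclassified resistant mutation shifts the sensitivity noticeably, which is one reason to average the metrics over many random splits.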

CONCLUSIONS

It is possible to predict whether a mutation in rpoB confers resistance to rifampicin using a machine learning model trained on a combination of structural, chemical and evolutionary features; however, performance is moderate and training is complicated by the lack of data.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4f86/12208608/e099f1fb2541/00952-2024.01.jpg
