Application of machine learning techniques to tuberculosis drug resistance analysis.

Author information

Department of Engineering Science, Institute of Biomedical Engineering.

Nuffield Department of Medicine, University of Oxford.

Publication information

Bioinformatics. 2019 Jul 1;35(13):2276-2282. doi: 10.1093/bioinformatics/bty949.

Abstract

MOTIVATION

Timely identification of Mycobacterium tuberculosis (MTB) resistance to existing drugs is vital to decrease mortality and prevent the amplification of existing antibiotic resistance. Machine learning methods have been widely applied to the timely prediction of MTB resistance to a specific drug and to the identification of resistance markers. However, they have not been validated on a large cohort of MTB samples from multiple centers across the world in terms of resistance prediction and resistance marker identification. Several machine learning classifiers and linear dimension reduction techniques were developed and compared on a cohort of 13 402 isolates collected from 16 countries across 6 continents and tested against 11 drugs.

RESULTS

Compared with a conventional molecular diagnostic test, the area under the curve (AUC) of the best machine learning classifier increased for all drugs, most notably by 23.11%, 15.22% and 10.14% for pyrazinamide, ciprofloxacin and ofloxacin, respectively (P < 0.01). Logistic regression and gradient tree boosting were found to perform better than other techniques. Moreover, adding a sparse principal component analysis or non-negative matrix factorization step to logistic regression or gradient tree boosting, compared with the classifier alone, improved the best F1-score by 12.54%, 4.61%, 7.45% and 9.58% for amikacin, moxifloxacin, ofloxacin and capreomycin, respectively, as well as increasing the AUC for amikacin and capreomycin. The results provide a comprehensive comparison of various techniques and support the use of machine learning for improved prediction on large, diverse tuberculosis datasets. Furthermore, mutation ranking showed the possibility of finding new resistance/susceptibility markers.
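The two metrics reported above, F1-score and AUC, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 0.5 decision threshold and the toy labels and scores are assumptions invented for illustration, with 1 standing for a resistant isolate and 0 for a susceptible one.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive (resistant) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores):
    """Probability that a random resistant isolate outscores a random
    susceptible one; tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: phenotype labels and classifier scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print(f1_score(y_true, y_pred))  # 0.75
print(auc(y_true, scores))       # 0.9375
```

F1 depends on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason a dimension reduction step can improve F1 for a drug without necessarily changing AUC.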

AVAILABILITY AND IMPLEMENTATION

The source code can be found at http://www.robots.ox.ac.uk/~davidc/code.php.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
