The Bioinformatics Group, School of Water, Energy and Environment, Cranfield University, Bedford, UK.
University of Natural Resources and Life Sciences, Vienna, Austria.
PLoS Comput Biol. 2020 Dec 21;16(12):e1008518. doi: 10.1371/journal.pcbi.1008518. eCollection 2020 Dec.
Tuberculosis is a major global public health concern, and the growing prevalence of drug-resistant Mycobacterium tuberculosis is making disease control more difficult. However, whole-genome sequencing is increasingly applied as a diagnostic tool, and the resulting drug-resistance profiles inform clinical practice and treatment decisions. Computational approaches for identifying established and novel resistance-conferring mutations in genomic data include genome-wide association study (GWAS) methodologies, tests for convergent evolution, and machine learning techniques. These methods may be confounded by extensive co-occurrent resistance, in which the statistical model for one drug includes unrelated mutations known to cause resistance to other drugs. Here, we introduce a novel 'cannibalistic' elimination algorithm ("Hungry, Hungry SNPos") that attempts to remove these co-occurrent resistant variants. Using an M. tuberculosis genomic dataset for the virulent Beijing strain-type (n = 3,574) with phenotypic resistance data across five drugs (isoniazid, rifampicin, ethambutol, pyrazinamide, and streptomycin), we demonstrate that this new approach is considerably more robust than traditional methods and detects resistance-associated variants that are too rare to be likely to be detected by correlation-based techniques such as GWAS.
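The confounding described above can be illustrated with a small toy example. The sketch below is not the Hungry, Hungry SNPos algorithm, whose details are not given in this abstract; it only shows, on invented counts, how a mutation that plays no causal role for one drug can still correlate strongly with resistance to it when resistances co-occur. The variant names `snp_inh` and `snp_rif`, the isolate counts, and both helper functions are hypothetical.

```python
from math import sqrt

# Toy isolates (invented counts): (snp_inh, snp_rif, isoniazid_resistant).
# In this toy, snp_inh is the only cause of isoniazid resistance; snp_rif is
# assumed to affect another drug but often co-occurs with snp_inh.
isolates = (
    [(1, 1, 1)] * 40    # carry both variants, isoniazid resistant
    + [(1, 0, 1)] * 10  # carry snp_inh only, isoniazid resistant
    + [(0, 1, 0)] * 5   # carry snp_rif only, isoniazid susceptible
    + [(0, 0, 0)] * 45  # carry neither, isoniazid susceptible
)

def phi(xs, ys):
    """Phi coefficient (Pearson correlation) between two binary vectors."""
    n11 = sum(1 for x, y in zip(xs, ys) if x and y)
    n10 = sum(1 for x, y in zip(xs, ys) if x and not y)
    n01 = sum(1 for x, y in zip(xs, ys) if not x and y)
    n00 = sum(1 for x, y in zip(xs, ys) if not x and not y)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return (n11 * n00 - n10 * n01) / denom if denom else 0.0

def risk_difference(rows):
    """Resistance rate in snp_rif carriers minus the rate in non-carriers."""
    carriers = [r for _, rif, r in rows if rif]
    others = [r for _, rif, r in rows if not rif]
    if not carriers or not others:
        return None  # stratum lacks one of the two groups
    return sum(carriers) / len(carriers) - sum(others) / len(others)

snp_inh = [i for i, _, _ in isolates]
snp_rif = [f for _, f, _ in isolates]
resistant = [r for _, _, r in isolates]

# Marginal association: both variants correlate with isoniazid resistance.
phi_inh = phi(snp_inh, resistant)
phi_rif = phi(snp_rif, resistant)

# Stratify by the causal variant: the snp_rif signal disappears.
with_inh = [row for row in isolates if row[0] == 1]
without_inh = [row for row in isolates if row[0] == 0]
rd_with = risk_difference(with_inh)
rd_without = risk_difference(without_inh)

print(f"phi(snp_inh, resistance) = {phi_inh:.3f}")
print(f"phi(snp_rif, resistance) = {phi_rif:.3f}")
print(f"snp_rif risk difference given snp_inh present: {rd_with}")
print(f"snp_rif risk difference given snp_inh absent:  {rd_without}")
```

In this toy dataset the unrelated variant shows a marginal correlation of about 0.70 with isoniazid resistance, yet its effect is zero within each stratum of the causal variant. A purely correlation-based screen would flag it, which is the kind of co-occurrent signal an elimination step aims to remove.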