Medical Research Center, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Peking Union Medical College, Beijing, 100730, China.
Department of Clinical Laboratory, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing, 100730, China.
Microb Biotechnol. 2021 Jul;14(4):1539-1549. doi: 10.1111/1751-7915.13815. Epub 2021 May 21.
Non-tuberculous mycobacteria (NTM) can cause various respiratory diseases and, in severe cases, death, and the incidence of NTM infection has increased rapidly worldwide. To date, routine diagnostic methods and strain identification have struggled to diagnose the various types of NTM infection precisely. We combined systematic comparative genomics with machine learning to select new diagnostic markers for precisely identifying five common pathogenic NTMs (Mycobacterium kansasii, Mycobacterium avium, Mycobacterium intracellulare, Mycobacterium chelonae, Mycobacterium abscessus). A panel of six genes and two SNPs (nikA, benM, codA, pfkA2, mpr, yjcH, rrl C2638T, rrl A1173G) was selected to identify the five NTMs simultaneously with high accuracy (> 90%). Notably, a panel containing only the six genes also classified the species well (accuracy > 90%). Additionally, both panels could precisely differentiate the five NTMs from M. tuberculosis (accuracy > 99%). We also identified new marker genes, SNPs and combinations that accurately discriminate each of the five NTMs individually, which makes it possible to diagnose a specific NTM infection precisely. Our research not only reveals promising novel diagnostic markers to advance precision diagnosis of NTM infections, but also provides insight into precisely identifying genetically close pathogens through comparative genomics and machine learning.
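As a rough illustration of how a marker panel like the one above can be turned into a species call, the sketch below assigns a sample to the reference profile it differs from in the fewest markers (nearest profile by Hamming distance). This is a stand-in technique chosen for simplicity; the abstract does not state which machine-learning model the authors used. The marker names come from the abstract, but the species labels (`species_A` to `species_E`), the 0/1 reference patterns, and the function names are hypothetical placeholders, not results from the paper.

```python
# Marker names are taken from the abstract; everything else is an assumption.
MARKERS = ["nikA", "benM", "codA", "pfkA2", "mpr", "yjcH",
           "rrl_C2638T", "rrl_A1173G"]

# Hypothetical reference profiles (1 = marker detected, 0 = not detected).
# These bit patterns are invented for illustration only and do NOT reflect
# the presence/absence patterns reported for any real NTM species.
REFERENCE_PROFILES = {
    "species_A": (1, 1, 1, 0, 0, 0, 0, 0),
    "species_B": (0, 0, 0, 1, 1, 1, 0, 0),
    "species_C": (1, 0, 0, 1, 0, 0, 1, 1),
    "species_D": (0, 1, 0, 0, 1, 0, 1, 1),
    "species_E": (0, 0, 1, 0, 0, 1, 1, 0),
}


def hamming(a, b):
    """Count the positions at which two equal-length profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return sum(x != y for x, y in zip(a, b))


def classify(profile, references=REFERENCE_PROFILES):
    """Return the label of the reference profile closest to `profile`.

    Labels are visited in sorted order, so ties resolve deterministically.
    """
    return min(sorted(references), key=lambda label: hamming(profile, references[label]))


def accuracy(samples, references=REFERENCE_PROFILES):
    """Fraction of (profile, true_label) pairs that are classified correctly."""
    correct = sum(classify(profile, references) == label for profile, label in samples)
    return correct / len(samples)


if __name__ == "__main__":
    # A made-up sample: species_C's placeholder profile with the last marker missed.
    noisy_sample = (1, 0, 0, 1, 0, 0, 1, 0)
    print(dict(zip(MARKERS, noisy_sample)))
    print("predicted:", classify(noisy_sample))
```

Because the placeholder profiles differ from one another in at least four markers, a single missed or spurious marker still yields the intended call in this toy setting. Real panels would need profiles and error rates estimated from sequenced isolates.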