Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America.
Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, United States of America.
PLoS One. 2018 Nov 20;13(11):e0207491. doi: 10.1371/journal.pone.0207491. eCollection 2018.
Tuberculosis is a major cause of morbidity and mortality in the developing world. Drug resistance, which is predicted to rise in many countries worldwide, threatens tuberculosis treatment and control.
We aimed to identify features associated with treatment failure and to predict which patients are at highest risk of treatment failure.
We applied various machine learning techniques to a multi-country dataset managed by the National Institute of Allergy and Infectious Diseases, both to identify factors statistically associated with treatment failure and to predict treatment failure from baseline demographic and clinical characteristics alone.
The complete-case analysis database consisted of 587 patients (68% male) with a median (p25-p75) age of 40 (30-51) years. Treatment failure occurred in approximately one quarter of the patients. The features most strongly associated with treatment failure were patterns of drug sensitivity, imaging findings, findings on Ziehl-Neelsen stain microscopy, education status, and employment status. The most predictive model was forward stepwise selection (AUC: 0.74), although most models performed at or above an AUC of 0.7. A sensitivity analysis that included all 643 original patients, with missing values filled in by multiple imputation, showed similar predictive features and generally increased predictive performance.
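As a reading aid for the AUC values reported above, the following is a minimal sketch of how an area under the ROC curve can be computed from predicted risk scores. It is not the authors' code: the function name `auc`, the 0/1 outcome coding, and the toy data are all assumptions made for illustration. The AUC is the probability that a randomly chosen patient who failed treatment receives a higher predicted risk than a randomly chosen patient who did not, with ties counted as one half.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    labels: 1 = treatment failure, 0 = treatment success (assumed coding).
    scores: model-predicted risk of failure; higher means higher risk.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one patient in each outcome group")

    # Count failure/success pairs that the model ranks correctly.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # failure ranked above success
            elif p == n:
                wins += 0.5   # a tie counts as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Toy, made-up data for illustration only (not from the study).
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4, 0.3, 0.5]
toy_auc = auc(toy_labels, toy_scores)
```

On this scale, 0.5 corresponds to ranking patients no better than chance and 1.0 to perfect separation of the two outcome groups. The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of a few hundred; a rank-based formulation would be the usual choice for larger datasets.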
Machine learning can help to identify patients at higher risk of treatment failure. Closer monitoring of these patients may decrease treatment failure rates and prevent the emergence of antibiotic resistance. Because it relies on inexpensive, basic demographic and clinical features, this approach is attractive in low- and middle-income countries.