Department of Respiratory Medicine, Hannover Medical School (MHH), Hannover, Germany; German Center for Lung Research (DZL), Giessen, Germany.
Department of Respiratory Medicine, Hannover Medical School (MHH), Hannover, Germany.
Int J Infect Dis. 2021 Mar;104:398-406. doi: 10.1016/j.ijid.2021.01.003. Epub 2021 Jan 11.
Administrative claims data are prone to underestimate the burden of non-tuberculous mycobacterial pulmonary disease (NTM-PD).
We developed machine learning-based algorithms using historical claims data from cases with NTM-PD to identify patients with a high probability of having previously undiagnosed NTM-PD and to assess actual prevalence and incidence. Adults with incident NTM-PD were identified by International Classification of Diseases, 10th revision code A31.0 in a representative 5% sample of the German population covered by statutory health insurance during 2011-2016. Pre-diagnosis characteristics (patient demographics, comorbidities, diagnostic and therapeutic procedures, and medications) were extracted and compared with those of a control group without NTM-PD to identify risk factors.
Applying a random forest model (area under the curve 0.847; total error 19.4%) with a risk threshold of >99%, the 2016 prevalence and incidence rates for coded and non-coded cases combined were 19 and 15 cases/100,000 population, respectively, a 5-fold and 9-fold increase over coded cases alone.
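The thresholding and rate calculation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy risk scores, and the population size of 50,000 are all hypothetical, and the predicted risks are assumed to come from an already fitted classifier such as a random forest.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve: the probability that a randomly chosen
    case scores higher than a randomly chosen control (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def count_cases(risks: Sequence[float], coded: Sequence[bool],
                threshold: float = 0.99) -> tuple:
    """Return (coded cases, non-coded people whose predicted risk
    exceeds the threshold)."""
    n_coded = sum(1 for c in coded if c)
    n_predicted = sum(1 for r, c in zip(risks, coded)
                      if not c and r > threshold)
    return n_coded, n_predicted


def rate_per_100k(n_cases: int, population: int) -> float:
    """Express a case count as a rate per 100,000 population."""
    return 100_000 * n_cases / population


# Hypothetical toy data: predicted risks and whether each person
# already carries the diagnosis code.
risks = [0.999, 0.995, 0.40, 0.10, 0.97, 0.02]
coded = [True, False, False, False, False, False]
population = 50_000  # hypothetical insured population

n_coded, n_predicted = count_cases(risks, coded)
coded_rate = rate_per_100k(n_coded, population)
combined_rate = rate_per_100k(n_coded + n_predicted, population)
fold_increase = combined_rate / coded_rate
```

In this toy example only the non-coded person with a risk of 0.995 passes the >99% threshold, so the combined rate is twice the coded rate. A strict threshold of this kind favors specificity: it flags few people, but those flagged are very likely to be true cases.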
A machine learning-based algorithm applied to German statutory health insurance claims data predicted, with high probability, a considerable number of previously unreported NTM-PD cases.
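Administrative claims data are prone to underestimating the burden of non-tuberculous mycobacterial pulmonary disease (NTM-PD).
Using historical claims data from NTM-PD cases, we developed machine learning-based algorithms to predict patients with a high probability of previously undiagnosed NTM-PD and to assess the actual prevalence and incidence. Adults with incident NTM-PD were identified by International Classification of Diseases, 10th revision code A31.0 in a representative 5% sample of the German population covered by statutory health insurance during 2011-2016. Pre-diagnosis characteristics (patient demographics, comorbidities, diagnostic and therapeutic procedures, and medications) were extracted and compared with those of a control group without NTM-PD to identify risk factors.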
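Applying a random forest model (area under the curve 0.847; total error 19.4%) and a risk threshold of >99%, prevalence and incidence rates in 2016 increased 5-fold and 9-fold, respectively, to 19 and 15 cases/100,000 population for coded and non-coded cases, compared with 3.8 and 3.1 cases/100,000 population for coded cases alone.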
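A machine learning-based algorithm applied to German statutory health insurance claims data predicted, with high probability, a considerable number of previously unreported NTM-PD cases.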