School of Medical Informatics, Chongqing Medical University, Chongqing, China.
Key Laboratory of Data Engineering and Visual Computing, Chongqing University of Posts and Telecommunications, Chongqing, China.
Front Public Health. 2021 Dec 24;9:800549. doi: 10.3389/fpubh.2021.800549. eCollection 2021.
The etiology of fever of unknown origin (FUO) is complex and remains a major challenge for clinicians. This study aimed to investigate the etiological distribution of classic FUO, to compare clinical indicators among patients with different etiologies of classic FUO, and to establish a machine learning (ML) model based on clinical data.

Clinical data and final diagnoses were collected for 527 patients who met the diagnostic criteria for classic FUO and were admitted to 7 medical institutions in Chongqing between January 2012 and August 2021. The 373 patients with a final diagnosis were divided into 4 groups according to the 4 etiological types of classic FUO, and statistical analysis was used to identify the indicators that differed significantly among the etiological types. Using these indicators, five ML models, namely random forest (RF), support vector machine (SVM), Light Gradient Boosting Machine (LightGBM), artificial neural network (ANN), and naive Bayes (NB), were evaluated on all datasets using 5-fold cross-validation, and model performance was measured by the micro-F1 score.

The 373 patients were divided into an infectious disease group (n = 277), a non-infectious inflammatory disease group (n = 51), a neoplastic disease group (n = 31), and an other diseases group (n = 14). Another 154 patients were classified as the undetermined group because the cause of fever was still unclear at discharge. Gender, age, and 18 other indicators differed significantly among the four etiological groups (P < 0.05). LightGBM achieved a micro-F1 score of 75.8%, higher than the other four ML models, making it the best-performing prediction model.

Infectious diseases remain the main etiological type of classic FUO. Based on 18 statistically significant clinical indicators such as gender and age, we constructed and evaluated five ML models. The LightGBM model performed well in predicting the etiological type of classic FUO and could serve as a useful decision-support aid.
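The evaluation protocol described above (5-fold cross-validation scored by micro-F1) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' pipeline: it trains none of the five models and instead uses a majority-class baseline as a stand-in predictor. The label names, the round-robin fold assignment, and the choice to pool predictions across folds before scoring are all assumptions. Only the group sizes (277, 51, 31, 14) come from the abstract.

```python
from collections import Counter


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP and FN over all classes, then compute F1."""
    tp = fp = fn = 0
    for c in set(y_true) | set(y_pred):
        for t, p in zip(y_true, y_pred):
            if p == c and t == c:
                tp += 1
            elif p == c:
                fp += 1
            elif t == c:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def stratified_folds(y, k=5):
    """Split sample indices into k folds, dealing each class out round-robin
    so that every fold keeps roughly the overall class proportions.
    (Deterministic, no shuffling: an assumption made for reproducibility.)"""
    by_class = {}
    for i, label in enumerate(y):
        by_class.setdefault(label, []).append(i)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for j, i in enumerate(indices):
            folds[j % k].append(i)
    return folds


def majority_baseline_cv(y, k=5):
    """k-fold CV of a predictor that always outputs the most frequent
    training-fold class. Predictions are pooled across folds before scoring."""
    y_true_all, y_pred_all = [], []
    for held_out in stratified_folds(y, k):
        held = set(held_out)
        train_labels = [label for i, label in enumerate(y) if i not in held]
        majority = Counter(train_labels).most_common(1)[0][0]
        for i in held_out:
            y_true_all.append(y[i])
            y_pred_all.append(majority)
    return micro_f1(y_true_all, y_pred_all)


# Group sizes as reported in the abstract; the label names are hypothetical.
group_sizes = {
    "infectious": 277,
    "non_infectious_inflammatory": 51,
    "neoplastic": 31,
    "other": 14,
}
y = [name for name, n in group_sizes.items() for _ in range(n)]

score = majority_baseline_cv(y)
print(f"{score:.3f}")  # prints 0.743
```

When each patient receives exactly one predicted label, every misclassification counts once as a false positive and once as a false negative, so micro-F1 equals plain accuracy. On the reported group sizes, the stand-in baseline therefore scores 277/373, which gives a point of reference for reading micro-F1 values on a dataset with this class imbalance.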