Department of Pediatrics, the Affiliated Changsha Central Hospital, Hengyang Medical School, University of South China, Changsha, Hunan, China.
Department of Joint Surgery, the Hong-he Affiliated Hospital of Kunming Medical University/The Southern Central Hospital of Yun-nan Province (The First People's Hospital of Honghe State), Changsha, Hunan, China.
PeerJ. 2024 Mar 26;12:e17164. doi: 10.7717/peerj.17164. eCollection 2024.
This study aimed to create a predictive model based on machine learning to identify the risk for tracheobronchial tuberculosis (TBTB) occurring alongside pneumonia in pediatric patients.
Clinical data from 212 pediatric patients were examined in this retrospective analysis. The cohort included 42 patients diagnosed with TBTB and pneumonia (combined group) and 170 patients diagnosed with lobar pneumonia alone (pneumonia group). Three predictive models (XGBoost, decision tree, and logistic regression) were constructed, and their performance was assessed using receiver operating characteristic (ROC) curves, precision-recall (PR) curves, and decision curve analysis (DCA). The dataset was split in a 7:3 ratio into a first and a second validation group, which were used to validate the XGBoost model and to construct the nomogram model.
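As an illustration of the evaluation setup described above, the following is a minimal sketch of a class-stratified 7:3 split and a rank-based ROC AUC, written in plain Python. It is not the authors' code: the function names `stratified_split` and `roc_auc`, the stratification by class, the random seed, and the toy scores are all assumptions made for the example; only the group sizes (42 and 170) and the 7:3 ratio come from the abstract.

```python
import random

def stratified_split(labels, first_frac=0.7, seed=0):
    """Split indices into two groups, keeping the class ratio in each.

    Hypothetical helper: the abstract does not say whether the 7:3
    split was stratified, so stratification is an assumption here.
    """
    rng = random.Random(seed)
    first, second = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(first_frac * len(idx))
        first += idx[:cut]
        second += idx[cut:]
    return sorted(first), sorted(second)

def roc_auc(labels, scores):
    """ROC AUC as the probability that a positive case outscores a
    negative case, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Group sizes from the abstract: 42 combined-group and 170 pneumonia-group patients.
labels = [1] * 42 + [0] * 170
group1, group2 = stratified_split(labels)
print(len(group1), len(group2))

# Toy scores (invented for illustration) to show the AUC calculation.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

In practice the scores passed to `roc_auc` would be the predicted probabilities from each fitted model on the held-out group.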
The XGBoost model highlighted eight significant signatures, while the decision tree and logistic regression models identified six and five signatures, respectively. The ROC analysis revealed an area under the curve (AUC) of 0.996 for XGBoost, significantly outperforming the other models (P < 0.05). Similarly, the PR curve demonstrated the superior predictive capability of XGBoost. DCA further confirmed that XGBoost offered the highest AIC (43.226), the highest average net benefit (0.764), and the best model fit. Validation confirmed the robustness of the findings, with validation groups 1 and 2 showing ROC and PR curves with an AUC of 0.997, indicating a high net benefit. The nomogram model was shown to possess significant clinical value.
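For readers unfamiliar with the net benefit reported by DCA, the sketch below uses the standard definition at a threshold probability p_t: true positives per patient minus false positives per patient weighted by p_t / (1 - p_t). The function names, the threshold grid, and the toy data are assumptions for illustration. The abstract does not state how the "average net benefit" was computed, so averaging over a grid of thresholds is only one plausible reading.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of treating every patient whose predicted
    probability is at or above `threshold` (0 < threshold < 1)."""
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)

def mean_net_benefit(labels, probs, thresholds):
    """Average net benefit over a grid of thresholds (an assumed
    summary, not necessarily the one used in the study)."""
    values = [net_benefit(labels, probs, t) for t in thresholds]
    return sum(values) / len(values)

# Toy data, invented for illustration only.
toy_labels = [1, 1, 0, 0]
toy_probs = [0.9, 0.6, 0.7, 0.2]
print(net_benefit(toy_labels, toy_probs, 0.5))
print(mean_net_benefit(toy_labels, toy_probs, [0.25, 0.5, 0.75]))
```

A decision curve plots `net_benefit` against the threshold for each model, alongside the "treat all" and "treat none" strategies, and the model with the highest curve over the clinically relevant range is preferred.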
Compared to the other machine learning approaches, the XGBoost model demonstrated superior predictive efficacy in identifying pediatric patients at risk of concurrent TBTB and pneumonia. The model's identification of critical signatures provides valuable insights into the pathogenesis of these conditions.