Jiang Yuhan, Wang Xu, Li Li, Wang Yifan, Wang Xuelin, Zou Yingxue
Tianjin Children's Hospital (Children's Hospital, Tianjin University), Tianjin, China.
Clinical School of Pediatrics, Tianjin Medical University, Tianjin, China.
Sci Rep. 2025 May 23;15(1):18029. doi: 10.1038/s41598-025-02962-4.
In recent years, the incidence of refractory Mycoplasma pneumoniae pneumonia (RMPP) has risen significantly. RMPP can cause severe pulmonary and extrapulmonary complications, and identifying it early remains a challenge for clinicians. In this retrospective single-center study, we included patients diagnosed with Mycoplasma pneumoniae pneumonia in 2021 and categorized them into RMPP and non-RMPP groups. Univariate regression analysis first identified variables associated with RMPP. Seven mainstream machine learning methods were then used to construct predictive models, whose reliability and robustness were evaluated through tenfold cross-validation and sensitivity analysis. The optimal model was selected using multidimensional metric assessments, and SHAP (SHapley Additive exPlanations) analysis identified the key predictive factors related to RMPP. Twenty-nine factors from various dimensions were found to be associated with RMPP and were used to build the predictive model. The XGBoost model demonstrated high predictive capability, with an accuracy of 0.80 and an area under the curve (AUC) of 0.93. Tenfold cross-validation and sensitivity analysis confirmed the model's robustness and reliability. SHAP analysis interpreted the final model through eight key features: fever duration, macrolide treatment before hospitalization, severe Mycoplasma pneumoniae pneumonia, lactate dehydrogenase, neutrophil-to-lymphocyte ratio, alanine aminotransferase, peak fever, and extensive lung consolidation. This simple, effective predictive model enhances clinicians' understanding and aids early identification of RMPP.
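The abstract reports two different metrics (accuracy and AUC) estimated with tenfold cross-validation. The following is a minimal sketch of how those quantities are commonly computed, written with the Python standard library only. It is not the authors' code and does not reproduce their XGBoost model; the function names (`roc_auc`, `accuracy`, `stratified_kfold`), the 0.5 decision threshold, and the toy labels and scores are all hypothetical illustrations, not study data.

```python
from itertools import product


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Share of cases classified correctly at a fixed threshold
    (the 0.5 default is an assumption, not taken from the study)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def stratified_kfold(labels, k=10):
    """Yield (train_idx, test_idx) pairs. Each class is dealt
    round-robin into k folds so class balance is roughly preserved."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        for j, i in enumerate(members):
            folds[j % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


# Toy example (invented values): 1 = RMPP, 0 = non-RMPP.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.45, 0.4, 0.6, 0.3, 0.2, 0.1, 0.35, 0.05]

print("AUC:", round(roc_auc(labels, scores), 3))
print("Accuracy at 0.5:", accuracy(labels, scores))
```

On this toy data the AUC is 22/24 (about 0.92) while accuracy at the 0.5 threshold is 0.70. AUC depends only on how cases are ranked, whereas accuracy depends on the chosen cutoff, which is one reason a model can show a high AUC alongside a lower accuracy.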