He Bing, Li Xuewen, Dong Rongrong, Yao Han, Zhou Qi, Xu Changyan, Shang Chengming, Zhao Bo, Zhou Huiling, Yu Xinqiao, Xu Jiancheng
Department of Laboratory Medicine, First Hospital of Jilin University, Changchun, 130021, China.
Department of Hematology, First Hospital of Jilin University, Changchun, 130021, China.
Sci Rep. 2025 Mar 19;15(1):9431. doi: 10.1038/s41598-025-92089-3.
Severe Mycoplasma pneumoniae pneumonia (SMPP) is difficult to diagnose because its clinical features overlap with those of other common respiratory diseases. This study aimed to develop and validate machine learning (ML) models for the early identification of SMPP and for predicting the risk of liver and heart damage in SMPP, using accessible laboratory indicators. Cohort 1 was divided into an SMPP group and an other respiratory diseases group. Cohort 2 was divided into myocardial damage, liver damage, and non-damage groups. Models built with five ML algorithms were compared to select the best algorithm and model. Receiver operating characteristic (ROC) curves, accuracy, sensitivity, and other performance indicators were used to evaluate each model. Feature importance and SHapley Additive exPlanations (SHAP) values were used to improve the interpretability of the models. Cohort 3 was used for external validation. In Cohort 1, the SMPP differential diagnostic model developed with the LightGBM algorithm achieved the highest performance (AUC = 0.975). In Cohort 2, the LightGBM model performed best at distinguishing myocardial damage, liver damage, and non-damage in SMPP patients (accuracy = 0.814). Feature importance and SHAP values indicated that alanine aminotransferase (ALT) and creatine kinase-MB (CK-MB) were the key contributors to the magnitude of Model 2's output. The diagnostic and predictive abilities of the ML models were validated in Cohort 3, indicating that the models have some clinical generalizability. Model 1 and Model 2, both built with the LightGBM algorithm, showed excellent ability in the differential diagnosis of SMPP and in predicting the risk of organ damage in children.
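The abstract reports AUC, accuracy, and sensitivity but does not show how such figures are obtained. As a rough illustration only, the sketch below computes them for a binary classifier in plain Python. The function names `roc_auc` and `threshold_metrics`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for this example; none of them come from the study, and the authors' actual pipeline (LightGBM plus their own evaluation code) is not reproduced here.

```python
from typing import Dict, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def threshold_metrics(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity at a fixed decision threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(labels),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical toy data (1 = SMPP, 0 = other respiratory disease); not study data.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

auc = roc_auc(labels, scores)
metrics = threshold_metrics(labels, scores)
print(f"AUC = {auc:.4f}")
print(metrics)
```

The pairwise AUC definition is quadratic in the number of samples, which is fine for illustration; a library routine would normally be preferred for real cohorts. For the three-class task in Cohort 2, accuracy generalizes directly, while AUC would need a one-vs-rest or similar extension that is not shown here.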