Duan Mengyu, Geng Zhimin, Gao Lichao, Zhao Yonggen, Li Zheming, Chen Lindong, Kuosmanen Pekka, Qi Guoqiang, Gong Fangqi, Yu Gang
National Clinical Research Center for Child Health, National Children's Regional Medical Center, Children's Hospital, Zhejiang University School of Medicine, Hangzhou, China.
Sino-Finland Joint AI Laboratory for Child Health of Zhejiang Province, Hangzhou, China.
Sci Rep. 2025 Mar 7;15(1):7927. doi: 10.1038/s41598-025-92277-1.
Kawasaki disease (KD) is an acute systemic vasculitis syndrome commonly observed in children. Because its pathogenesis is unclear and no specific diagnostic markers exist, it is easily confused with other diseases that present with similar symptoms, which makes early and accurate diagnosis challenging. This study aimed to develop an interpretable machine learning (ML) diagnostic model for KD. We collected demographic and laboratory data from 3650 patients (2299 with KD, 1351 with similar symptoms but different diseases) and employed 10 ML algorithms to construct the diagnostic model. Diagnostic performance was evaluated using several metrics, including the area under the receiver operating characteristic curve (AUC). Additionally, the Shapley additive explanations (SHAP) method was used to select important features and explain the final model. Using the Streamlit framework, we converted the model into a user-friendly web application to enhance its practicality in clinical settings. Among the 10 ML algorithms, XGBoost demonstrated the best diagnostic performance, achieving an AUC of 0.9833. SHAP analysis revealed that features including age in months, fibrinogen, and human interferon gamma were important for diagnosis. When restricted to the 10 most important features, the model still achieved an AUC of 0.9757. The proposed model can assist clinicians in making early and accurate diagnoses of KD. Furthermore, its interpretability enhances model transparency, helping clinicians understand the reliability of its predictions.
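The two quantities this abstract leans on, AUC and Shapley-based feature attributions, can be illustrated with a minimal standard-library sketch. It is not the authors' pipeline, which used XGBoost and the SHAP method. The function names (`roc_auc`, `shapley_values`, `toy_model`), the baseline-replacement value function, and all numbers below are assumptions chosen for illustration and are unrelated to the study data.

```python
from itertools import combinations
from math import factorial


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def shapley_values(model, x, baseline):
    """Exact Shapley attributions by enumerating every feature coalition.

    Assumption: features outside a coalition are replaced by their baseline
    value. Cost grows as 2^n, so this is only practical for a few features.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(v):
    # Hypothetical score with one interaction term; not a clinical model.
    return 2.0 * v[0] + 1.0 * v[1] + v[0] * v[2]


# Made-up labels and scores, for illustration only.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
phi = shapley_values(toy_model, x=[1.0, 2.0, 3.0], baseline=[0.0, 0.0, 0.0])
print(auc, phi)
```

The attributions sum to the difference between the model output at `x` and at the baseline, which is the additivity property that lets SHAP values be read as per-feature contributions to a single prediction. Ranking features by mean absolute attribution across patients is one common way to arrive at a "top 10" subset like the one reported above.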