Intelligent diagnosis of Kawasaki disease from real-world data using interpretable machine learning models.

Author information

Duan Yifan, Wang Ruiqi, Huang Zhilin, Chen Haoran, Tang Mingkun, Zhou Jiayin, Hu Zhengyong, Hu Wanfei, Chen Zhenli, Qian Qing, Wang Haolin

Affiliations

Institute of Medical Information, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100020, PR China.

College of Medical Informatics, Chongqing Medical University, Chongqing 400016, PR China.

Publication information

Hellenic J Cardiol. 2025 Jan-Feb;81:38-48. doi: 10.1016/j.hjc.2024.08.003. Epub 2024 Aug 10.

Abstract

OBJECTIVE

This study aimed to leverage real-world electronic medical record data to develop interpretable machine learning models for diagnosis of Kawasaki disease while also exploring and prioritizing the significant risk factors.

METHODS

A comprehensive study was conducted on 4087 pediatric patients at the Children's Hospital of Chongqing, China. The study collected demographic data, physical examination results, and laboratory findings. Statistical analyses were performed using IBM SPSS Statistics, Version 26.0. The optimal feature subset was used to develop intelligent diagnostic prediction models based on the Light Gradient Boosting Machine, Explainable Boosting Machine (EBM), Gradient Boosting Classifier (GBC), Fast Interpretable Greedy-Tree Sums, Decision Tree, AdaBoost Classifier, and Logistic Regression. Model performance was evaluated in three dimensions: discriminative ability via receiver operating characteristic curves, calibration accuracy using calibration curves, and interpretability through SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-Agnostic Explanations).
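As a rough illustration of the first two evaluation dimensions named above, the sketch below computes an area under the ROC curve and the points of a calibration curve using only the Python standard library. The function names `roc_auc` and `calibration_bins`, the equal-width binning, and the default bin count are assumptions made for this example; the abstract does not say which software or binning scheme the authors used for these steps.

```python
def roc_auc(y_true, y_score):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as half a win)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def calibration_bins(y_true, y_prob, n_bins=5):
    """Points of a calibration curve: for each non-empty equal-width
    probability bin, return (mean predicted probability,
    observed fraction of positives, number of cases)."""
    bins = [[0.0, 0, 0] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx][0] += p
        bins[idx][1] += y
        bins[idx][2] += 1
    return [(total / count, positives / count, count)
            for total, positives, count in bins if count > 0]
```

A well-calibrated model yields bins whose mean predicted probability is close to the observed fraction of positives, whereas the AUC depends only on how the cases are ranked. The pairwise loop is quadratic in the number of cases, which is acceptable for a sketch but not for large cohorts.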

RESULTS

In this study, Kawasaki disease was diagnosed in 2971 participants. Analysis was conducted on 31 indicators, including red blood cell distribution width and erythrocyte sedimentation rate. The EBM model performed better than most of the other models, with an area under the curve of 0.97, second only to the GBC model. Furthermore, the EBM model exhibited the highest calibration accuracy and maintained its interpretability without relying on external analytical tools such as SHAP and LIME, thus reducing interpretation biases. Platelet distribution width, total protein, and erythrocyte sedimentation rate were identified by the model as significant predictors for the diagnosis of Kawasaki disease.
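To show why an EBM can be read without tools such as SHAP or LIME, the sketch below mimics its general structure: EBMs are commonly described as generalized additive models, in which the predicted log-odds is an intercept plus one learned shape function per feature. Everything concrete here is hypothetical and chosen for illustration only, including the feature names, the two shape functions, and the intercept; none of it comes from the study, and the pairwise interaction terms that EBMs can also include are left out.

```python
import math


# Hypothetical per-feature shape functions (illustrative only, not fitted
# to any data). Each maps a feature value to a log-odds contribution.
def shape_feature_a(x):
    return 0.8 if x > 1.0 else -0.3


def shape_feature_b(x):
    return 0.05 * x


def explain(sample, intercept, shape_functions):
    """Return each feature's additive log-odds contribution and the
    resulting predicted probability for one sample."""
    contributions = {name: fn(sample[name])
                     for name, fn in shape_functions.items()}
    logit = intercept + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-logit))
    return contributions, probability


shape_functions = {"feature_a": shape_feature_a, "feature_b": shape_feature_b}
contributions, probability = explain(
    {"feature_a": 2.0, "feature_b": 10.0},
    intercept=-1.3,
    shape_functions=shape_functions,
)
```

Because the prediction is a plain sum, the `contributions` dictionary is itself the explanation: each entry states how much one feature moved the log-odds for this sample, with no post hoc approximation involved.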

CONCLUSION

This study used diverse machine learning models for early diagnosis of Kawasaki disease. The findings demonstrated that interpretable models such as EBM outperformed traditional machine learning models in terms of both interpretability and performance. Ensuring consistency between predictive models and clinical evidence is crucial for the successful integration of artificial intelligence into real-world clinical practice.
