Detection of Monkeypox Cases Based on Symptoms Using XGBoost and Shapley Additive Explanations Methods.

Authors

Farzipour Alireza, Elmi Roya, Nasiri Hamid

Affiliations

Department of Computer Science, Semnan University, Semnan 35131-19111, Iran.

Farzanegan Campus, Semnan University, Semnan 35197-34851, Iran.

Publication information

Diagnostics (Basel). 2023 Jul 17;13(14):2391. doi: 10.3390/diagnostics13142391.

Abstract

The monkeypox virus poses a new public health risk that could quickly escalate into a worldwide epidemic. Machine learning (ML) has recently shown considerable promise in diagnosing diseases such as cancer, detecting tumor cells, and identifying COVID-19 patients. In this study, we created a dataset from data collected and published by Global Health and used by the World Health Organization (WHO). The dataset is entirely textual and captures the relationship between symptoms and monkeypox. We analyzed the data with gradient boosting methods, namely Extreme Gradient Boosting (XGBoost), CatBoost, and LightGBM, and with other standard ML methods such as Support Vector Machine (SVM) and Random Forest, and compared all of these methods. The aim of the research is to provide a symptom-based ML model for diagnosing monkeypox; previous studies have examined diagnosis using images only. XGBoost performed best, with an accuracy of 1.0 in evaluation. To check the model's flexibility, k-fold cross-validation was used, reaching an average accuracy of 0.9 across 5 different splits of the test set. In addition, Shapley Additive Explanations (SHAP) was used to examine and explain the output of the XGBoost model.
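
The two evaluation ideas named in the abstract, k-fold splitting and Shapley-value attribution, can be illustrated with a minimal, standard-library-only sketch. This is not the authors' code and it does not use XGBoost or the SHAP library. The symptom names, the `toy_score` function, and its weights are hypothetical values chosen purely for illustration, and the Shapley values are computed by exact enumeration of feature subsets, which is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous (train, test) index pairs."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        folds.append((train, test))
        start += size
    return folds


def shapley_values(value_fn, n_features):
    """Exact Shapley values: weighted mean marginal contribution per feature."""
    phi = []
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        total = 0.0
        for size in range(len(others) + 1):
            # Weight |S|! (n - |S| - 1)! / n! for subsets S of this size.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value_fn(s | {i}) - value_fn(s))
        phi.append(total)
    return phi


# Hypothetical symptom flags and weights; not taken from the paper's dataset.
SYMPTOMS = ["rash", "fever", "headache"]


def toy_score(present):
    """Toy risk score given the set of symptom indices that are present."""
    score = 0.0
    if 0 in present:
        score += 0.5
    if 1 in present:
        score += 0.2
    if 0 in present and 1 in present:
        score += 0.2  # interaction term, shared equally between rash and fever
    return score


folds = k_fold_indices(10, 5)                   # 5 splits of 10 samples
phi = shapley_values(toy_score, len(SYMPTOMS))  # one attribution per symptom
for name, value in zip(SYMPTOMS, phi):
    print(f"{name}: {value:.3f}")
```

In this toy setting the attributions sum to the score obtained with all symptoms present, which is the additivity property that gives SHAP its name. In a 5-fold evaluation, the accuracy reported would be the mean of the per-fold test accuracies over the index pairs returned by `k_fold_indices`.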

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9f14/10378557/c013a9e466ca/diagnostics-13-02391-g001.jpg
