Explainable artificial intelligence model for identifying COVID-19 gene biomarkers.

Affiliations

Department of Biostatistics and Medical Informatics, Faculty of Medicine, Inonu University, 44280, Malatya, Turkey.

Software Engineering Department, King Hussein School for Computing Sciences, Amman, Jordan.

Publication information

Comput Biol Med. 2023 Mar;154:106619. doi: 10.1016/j.compbiomed.2023.106619. Epub 2023 Feb 1.

DOI:10.1016/j.compbiomed.2023.106619
PMID:36738712
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9889119/
Abstract

AIM

COVID-19 has revealed the need for fast and reliable methods to assist clinicians in diagnosing the disease. This article presents a model that applies explainable artificial intelligence (XAI) methods, based on machine learning techniques, to COVID-19 metagenomic next-generation sequencing (mNGS) samples.

METHODS

The data set used in the study contains expression values for 15,979 genes from 234 patients, of whom 141 (60.3%) were COVID-19 negative and 93 (39.7%) were COVID-19 positive. The least absolute shrinkage and selection operator (LASSO) method was applied to select genes associated with COVID-19. The Support Vector Machine-Synthetic Minority Oversampling Technique (SVM-SMOTE) was used to handle the class imbalance problem. Logistic regression (LR), SVM, random forest (RF), and extreme gradient boosting (XGBoost) models were constructed to predict COVID-19. An explainable approach based on local interpretable model-agnostic explanations (LIME) and SHapley Additive exPlanations (SHAP) was applied to determine COVID-19-associated biomarker candidate genes and to improve the final model's interpretability.
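The oversampling step can be illustrated with a minimal sketch. It implements plain SMOTE (interpolating between a minority sample and one of its nearest minority neighbours), not the SVM-SMOTE variant named in the abstract, which additionally concentrates on samples near an SVM decision boundary. The function name `smote_oversample`, the three-feature toy data, and the seeds are assumptions made for illustration; only the class sizes (141 and 93) come from the abstract.

```python
import random

def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points, each on the segment between a minority
    sample and one of its k nearest minority neighbours (plain SMOTE)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        lam = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + lam * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Toy data with the class sizes reported in the abstract.
# The values are random numbers, not real gene expression data.
data_rng = random.Random(42)
negatives = [[data_rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(141)]
positives = [[data_rng.gauss(1.0, 1.0) for _ in range(3)] for _ in range(93)]

# Generate enough synthetic positives to match the negative class.
new_positives = smote_oversample(positives, n_new=len(negatives) - len(positives))
balanced_positives = positives + new_positives
```

In this sketch the whole toy set is oversampled for brevity. In practice, oversampling is usually restricted to the training portion so that synthetic samples do not leak into evaluation data.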

RESULTS

For the diagnosis of COVID-19, the XGBoost model (accuracy: 0.930) outperformed the RF (accuracy: 0.912), SVM (accuracy: 0.877), and LR (accuracy: 0.912) models. According to SHAP, the three most important genes associated with COVID-19 were IFI27, LGR6, and FAM83A. The LIME results showed that high IFI27 expression in particular increased the predicted probability of the positive class.
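To make the SHAP-based ranking concrete, the following sketch computes exact Shapley values for a single sample by enumerating all feature coalitions. It is an illustration of the underlying idea only: the scoring function `toy_score`, its three unnamed features, and the all-zero baseline are hypothetical and unrelated to the study's XGBoost model or its genes. The SHAP library approximates or accelerates this computation for real models.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample. Features outside a coalition
    are replaced by their baseline value."""
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        rest = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(rest, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

# Hypothetical score over three features, with one interaction term.
def toy_score(z):
    a, b, c = z
    return 2.0 * a + 0.5 * b + 0.25 * a * c

x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_score, x, baseline)

# Rank features by absolute contribution, largest first.
ranking = sorted(range(len(phi)), key=lambda i: abs(phi[i]), reverse=True)
```

The contributions sum to the difference between the score at `x` and the score at the baseline, which is the additivity property that lets SHAP attribute a prediction to individual features. The interaction term is split equally between the two features involved.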

CONCLUSIONS

The proposed model (XGBoost) predicted COVID-19 successfully. The results show that machine learning combined with LIME and SHAP can explain the biomarker prediction for COVID-19 and give clinicians an intuitive, interpretable view of how risk factors affect the model's output.
