Karlafti Eleni, Anagnostis Athanasios, Kotzakioulafi Evangelia, Vittoraki Michaela Chrysanthi, Eufraimidou Ariadni, Kasarjyan Kristine, Eufraimidou Katerina, Dimitriadou Georgia, Kakanis Chrisovalantis, Anthopoulos Michail, Kaiafa Georgia, Savopoulos Christos, Didangelos Triantafyllos
First Propaedeutic Department of Internal Medicine, Aristotle University of Thessaloniki, AHEPA University Hospital of Thessaloniki, 54621 Thessaloniki, Greece.
Emergency Department, AHEPA University Hospital, Aristotle University of Thessaloniki, 54621 Thessaloniki, Greece.
J Pers Med. 2021 Dec 17;11(12):1380. doi: 10.3390/jpm11121380.
Since the beginning of the COVID-19 pandemic, 195 million people have been infected and 4.2 million have died from the disease or its side effects. Physicians, healthcare scientists and medical staff continuously try to cope with overloaded hospital admissions while, in parallel, trying to identify meaningful correlations between the severity of infection and patients' symptoms, comorbidities and biomarkers. Artificial intelligence (AI) and machine learning (ML) have recently been used in many areas related to COVID-19 healthcare, with the main goal of effectively managing the wide variety of issues related to COVID-19 and its consequences. The existing applications of ML to COVID-19 healthcare are based on supervised classification, which requires predefined classes and a labeled training dataset that serves as the reference point for learning. However, the existing knowledge about COVID-19 and its consequences is still not solid, and the points of common agreement among different scientific communities are still unclear. Therefore, this study followed an unsupervised clustering approach, in which no prior knowledge is required (tabula rasa). More specifically, 268 patients hospitalized at the First Propaedeutic Department of Internal Medicine of AHEPA University Hospital of Thessaloniki were assessed in terms of 40 clinical variables (numerical and categorical), leading to a high-dimensional dataset. Dimensionality reduction was performed by applying principal component analysis (PCA) to the numerical part of the dataset and multiple correspondence analysis (MCA) to the categorical part. Then, the Bayesian information criterion (BIC) was applied to Gaussian mixture models (GMM) in order to identify the optimal number of clusters, that is, the number under which the best grouping of patients occurs. The proposed methodology identified four clusters of patients with similar clinical characteristics. The analysis revealed a cluster of asymptomatic patients with a mortality rate of 23.8%. This striking result forces us to reconsider the relationship between the severity of COVID-19 clinical symptoms and patient mortality.
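The pipeline described in the abstract (PCA on the numerical variables, MCA on the categorical variables, then BIC-based selection of the number of GMM components) can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' implementation. The abstract does not state how many components were retained, whether the numerical variables were standardized, how the PCA and MCA outputs were combined, or which covariance structure and range of cluster counts were tried. The standardization, the choice of 3 and 2 retained dimensions, the simple concatenation, the full covariance matrices and the range 1 to 7 are therefore all assumptions, as are the helper names `mca_row_coordinates` and `select_gmm_by_bic`. MCA is implemented here as a correspondence analysis of the one-hot indicator matrix, using NumPy and scikit-learn.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler


def mca_row_coordinates(codes, n_components):
    """Minimal MCA: correspondence analysis of the one-hot indicator matrix.

    codes: (n_samples, n_vars) integer array of category codes.
    Returns the principal row coordinates, shape (n_samples, n_components).
    """
    # Build the indicator matrix, one block of columns per variable.
    blocks = []
    for j in range(codes.shape[1]):
        levels = np.unique(codes[:, j])
        blocks.append((codes[:, [j]] == levels[None, :]).astype(float))
    Z = np.hstack(blocks)

    P = Z / Z.sum()            # correspondence matrix
    r = P.sum(axis=1)          # row masses
    c = P.sum(axis=0)          # column masses
    # Standardized residuals from the independence model r c^T.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sing, _ = np.linalg.svd(S, full_matrices=False)
    # Principal row coordinates: D_r^{-1/2} U Sigma.
    return (U[:, :n_components] * sing[:n_components]) / np.sqrt(r)[:, None]


def select_gmm_by_bic(X, k_range, seed=0):
    """Fit one GMM per candidate k; return (best_k, best_model, bics)."""
    bics, models = [], []
    for k in k_range:
        gmm = GaussianMixture(n_components=k, covariance_type="full",
                              n_init=3, random_state=seed).fit(X)
        bics.append(gmm.bic(X))
        models.append(gmm)
    best = int(np.argmin(bics))  # lower BIC is better
    return list(k_range)[best], models[best], bics


# Synthetic stand-in data: only the cohort size (268) comes from the abstract.
rng = np.random.default_rng(0)
n = 268
group = rng.integers(0, 3, size=n)                       # hidden synthetic groups
X_num = rng.normal(size=(n, 10)) + 3.0 * group[:, None]  # numerical variables
X_cat = (group[:, None] + rng.integers(0, 2, size=(n, 6))) % 3  # categorical codes

# Assumption: standardize, keep 3 principal components and 2 MCA dimensions.
num_pcs = PCA(n_components=3).fit_transform(StandardScaler().fit_transform(X_num))
cat_dims = mca_row_coordinates(X_cat, n_components=2)
features = np.hstack([num_pcs, cat_dims])  # assumption: plain concatenation

best_k, model, bics = select_gmm_by_bic(features, range(1, 8))
labels = model.predict(features)
print("BIC per k:", np.round(bics, 1))
print("selected k:", best_k)
```

In this sketch the number of clusters is whichever candidate minimizes the BIC, which trades goodness of fit against the number of free parameters. On real clinical data the PCA scores and MCA coordinates may need rescaling before concatenation, since they are not on a common scale.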