Lau Kitty Yu-Yeung, Ng Kei-Shing, Kwok Ka-Wai, Tsia Kevin Kin-Man, Sin Chun-Fung, Lam Ching-Wan, Vardhanabhuti Varut
Biomedical Engineering Programme, The University of Hong Kong, Hong Kong, Hong Kong SAR, China.
Department of Diagnostic Radiology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, Hong Kong SAR, China.
Front Med (Lausanne). 2022 Feb 24;8:764934. doi: 10.3389/fmed.2021.764934. eCollection 2021.
This study aimed to better understand the different clinical phenotypes across the disease spectrum in patients with COVID-19 using an unsupervised machine learning clustering approach.
A population-based retrospective study was conducted using the demographics, clinical characteristics, comorbidities, and clinical outcomes of 7,606 COVID-19-positive patients on admission to public hospitals in Hong Kong in 2020. Unsupervised machine learning clustering was used to explore this large cohort.
Four clusters of differing clinical phenotypes were derived from data at initial admission; 86.6% of the deceased cases were aggregated in one of the clusters without prior knowledge of their clinical outcomes. Other distinctive clinical characteristics of this cluster were old age and a high burden of concurrent comorbidities, as well as laboratory characteristics of lower hemoglobin/hematocrit levels and higher neutrophil, C-reactive protein, lactate dehydrogenase, and creatinine levels. The clinical patterns captured by the cluster analysis were validated on other temporally distinct cohorts in 2021. The phenotypes aligned with existing literature.
The study demonstrated the usefulness of unsupervised machine learning techniques, which have the potential to uncover latent clinical phenotypes. Such clustering could serve as a more robust classification for patient triaging and patient-tailored treatment strategies.
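As a rough illustration of the kind of analysis described above, the sketch below clusters admission features without using outcomes and then checks, after the fact, what share of deaths falls into each cluster. The abstract does not name the clustering algorithm, so plain k-means is used here purely as a stand-in. The three features (age, C-reactive protein, hemoglobin), the synthetic values, the use of two clusters instead of the study's four, and all function names are hypothetical choices for this example, not the authors' method or data.

```python
import random
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column so no single feature dominates the distance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # constant column: avoid dividing by 0
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, n_iter=50, seed=0):
    """Plain k-means (an assumed stand-in); returns one cluster label per row."""
    rng = random.Random(seed)
    centers = [list(r) for r in rng.sample(rows, k)]
    labels = [0] * len(rows)
    for _ in range(n_iter):
        # Assignment step: each row goes to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(r, centers[j])) for r in rows]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                centers[j] = [mean(c) for c in zip(*members)]
    return labels


def death_share_by_cluster(labels, died, k):
    """Fraction of all deaths that falls in each cluster.

    Outcomes are used only here, after clustering, never as a feature.
    """
    total = sum(died)
    if total == 0:
        return [0.0] * k
    return [sum(d for lab, d in zip(labels, died) if lab == j) / total
            for j in range(k)]


def demo():
    # Synthetic toy cohort (hypothetical values): [age, CRP, hemoglobin].
    rng = random.Random(42)
    rows, died = [], []
    for _ in range(150):  # younger group with low inflammation markers
        rows.append([rng.gauss(40, 10), rng.gauss(5, 2), rng.gauss(14, 1)])
        died.append(1 if rng.random() < 0.01 else 0)
    for _ in range(50):  # older group with high CRP and lower hemoglobin
        rows.append([rng.gauss(80, 6), rng.gauss(80, 20), rng.gauss(11, 1)])
        died.append(1 if rng.random() < 0.30 else 0)
    labels = kmeans(standardize(rows), k=2)
    print(death_share_by_cluster(labels, died, k=2))


if __name__ == "__main__":
    demo()
```

Keeping the outcome out of the feature matrix is what makes a concentration of deaths in one cluster informative: it mirrors the "without prior knowledge of their clinical outcomes" condition in the results, although on real data the choice of algorithm, features, and number of clusters would need to follow the full paper.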