School of Computer and Information Technology, Beijing Jiaotong University, China.
Hubei Provincial Hospital of Traditional Chinese Medicine, China.
Int J Med Inform. 2024 Nov;191:105555. doi: 10.1016/j.ijmedinf.2024.105555. Epub 2024 Jul 20.
Symptoms are an important class of phenotypes for managing and controlling outbreaks of acute infectious diseases such as COVID-19. Although symptom cluster patterns and symptom time series are considered highly promising predictors of patient prognosis, fine-grained subtypes based on symptom phenotypes, and their progression patterns in relation to the prognosis of COVID-19 patients, have yet to be characterized. This study aims to identify patient subtypes and their progression patterns with distinct outcome and prognosis features.
This study included 14,139 longitudinal electronic medical records (EMRs) from four hospitals in Hubei Province, China, covering 2,683 individuals in the early stage of the COVID-19 pandemic. A deep representation learning model was developed to derive patients' symptom profiles, and the K-means clustering algorithm was used to divide the patients into distinct subtypes. Symptom progression patterns were then identified by comparing each patient's subtype at admission with that at discharge. Finally, Fisher's test was used to identify the clinical entities significantly associated with each subtype.
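The subtype-level significance testing described above can be illustrated with a minimal sketch. It assumes a one-sided Fisher's exact test for enrichment of one symptom in one subtype; the abstract does not state which variant of Fisher's test the authors used, and the function name `fisher_enrichment_p` and the example counts are hypothetical, not taken from the study.

```python
from math import comb


def fisher_enrichment_p(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test on a 2x2 table (illustrative sketch).

    a: patients in the subtype with the symptom
    b: patients in the subtype without the symptom
    c: patients outside the subtype with the symptom
    d: patients outside the subtype without the symptom

    Returns P(X >= a) under the hypergeometric null, i.e. the probability
    of seeing at least this many symptomatic patients in the subtype if
    symptom and subtype were independent.
    """
    n = a + b + c + d          # all patients
    in_subtype = a + b         # size of the subtype
    with_symptom = a + c       # patients with the symptom
    denom = comb(n, in_subtype)
    upper = min(in_subtype, with_symptom)
    tail = sum(
        comb(with_symptom, k) * comb(n - with_symptom, in_subtype - k)
        for k in range(a, upper + 1)
    )
    return tail / denom


# Hypothetical counts, for illustration only.
p_value = fisher_enrichment_p(3, 1, 1, 3)
print(f"one-sided p = {p_value:.4f}")
```

In practice such a test would be repeated for every symptom and subtype pair, so some correction for multiple comparisons would usually be needed; the abstract does not say whether one was applied.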
Three distinct patient subtypes with specific symptoms and prognoses were identified. Subtype 0 comprises 44.2% of all cases and is characterized by poor appetite, fatigue and sleep disorders; Subtype 1 comprises 25.6% of cases and is characterized by confusion, cough with bloody sputum, encopresis and urinary incontinence; Subtype 2 comprises 30.2% of cases and is characterized by dry cough and rhinorrhea. The three subtypes differ significantly in prognosis, with mortality rates of 4.72%, 8.59%, and 0.25%, respectively. Furthermore, the symptom cluster progression patterns showed that Subtype 0 patients who present with dark yellow urine, chest pain, etc. at admission have an elevated risk of progressing to the more severe subtypes with poor outcomes, whereas those presenting with nausea and vomiting tend to move to the milder subtype.
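One way to picture the progression patterns is as a table of transitions between the subtype assigned at admission and the subtype assigned at discharge. The sketch below assumes each patient is reduced to an (admission subtype, discharge subtype) pair; the function name `transition_rates` and the toy pairs are hypothetical and do not reproduce the study's data or its exact procedure.

```python
from collections import Counter


def transition_rates(pairs, n_subtypes=3):
    """Row-normalised admission-to-discharge transition table.

    pairs: iterable of (admission_subtype, discharge_subtype) tuples.
    Returns a list of rows; entry [i][j] is the share of patients
    admitted in subtype i who were discharged in subtype j.
    Rows with no patients are left as all zeros.
    """
    counts = Counter(pairs)
    table = []
    for i in range(n_subtypes):
        row_total = sum(counts[(i, j)] for j in range(n_subtypes))
        table.append([
            counts[(i, j)] / row_total if row_total else 0.0
            for j in range(n_subtypes)
        ])
    return table


# Toy data, for illustration only (not from the study).
toy_pairs = [(0, 0), (0, 1), (0, 2), (0, 2), (2, 2)]
for subtype, row in enumerate(transition_rates(toy_pairs)):
    print(subtype, row)
```

Symptoms that distinguish, for example, the 0-to-1 group from the 0-to-2 group could then be compared between the corresponding rows.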
This study proposes a clinically meaningful approach that uses deep representation learning and real-world EMR data containing symptom phenotypes to identify COVID-19 subtypes and their progression patterns. The results could help improve the precise stratification and management of acute infectious diseases.