Lam Chun Sing, Hua Rong, Loong Herbert Ho-Fung, Ngan Chun-Kit, Cheung Yin Ting
School of Pharmacy, Faculty of Medicine, Chinese University of Hong Kong, 8th Floor, Lo Kwee-Seong Integrated Biomedical Sciences Building, Area 39, The Chinese University of Hong Kong, Shatin, N.T, Hong Kong, China, 852 39436833.
Department of Clinical Oncology, Faculty of Medicine, Chinese University of Hong Kong, Hong Kong, China.
JMIR Cancer. 2025 Jul 16;11:e71937. doi: 10.2196/71937.
Patients with cancer and cancer survivors often experience multiple chronic health conditions, which can impact symptom burden and treatment outcomes. Despite the high prevalence of multimorbidity, research on cancer prognosis has predominantly focused on cancers in isolation. There is growing interest in machine learning techniques for cancer studies. However, these methods have not been applied in the context of supportive care for patients with cancer who have multimorbidity. Furthermore, few studies have investigated the associations between comorbidity clusters and mortality outcomes.
This study investigated comorbidity clusters among patients with cancer using machine learning and examined their associations with mortality outcomes in two large representative samples from the United States and Hong Kong.
This study used data from the National Health and Nutrition Examination Survey (NHANES) and the Hospital Authority Data Collaboration Laboratory (HADCL). Participants aged ≥20 years with a history of cancer were included. A two-step framework was used to identify clusters of comorbidities in NHANES. In the first step, we applied four machine learning techniques, including the Bernoulli mixture model and partition-based methods, to cluster the comorbidities. In the second step, domain experts reviewed and ranked the identified clusters to ensure clinical relevance. The clusters with the highest average rank were selected for further analysis. The associations between comorbidity clusters and mortality outcomes were analyzed using Cox proportional hazards models. For external validation, we used HADCL to evaluate the generalizability of the clusters identified in the NHANES cohort and of their associations with mortality. In HADCL, the same number of clusters was reproduced, based on the distinctive patterns and distribution of comorbidities observed within each cluster.
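The Bernoulli mixture model named in the first step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it fits the mixture by expectation-maximization on binary comorbidity indicators. The function name `fit_bernoulli_mixture`, the four condition names, the choice of two clusters, and the toy rows are all hypothetical; the toy rows are synthetic and are not NHANES or HADCL data.

```python
import math
import random


def fit_bernoulli_mixture(rows, k, n_iter=100, seed=0):
    """Fit a k-component Bernoulli mixture to binary rows with EM.

    Returns (weights, probs, labels): mixing weights, per-cluster
    probability of each condition, and a hard cluster label per row.
    """
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    eps = 1e-9  # keeps logarithms and divisions finite
    weights = [1.0 / k] * k
    # Random start away from 0 and 1 so the clusters can differentiate.
    probs = [[rng.uniform(0.25, 0.75) for _ in range(d)] for _ in range(k)]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each cluster for each row.
        resp = []
        for x in rows:
            log_post = []
            for c in range(k):
                lp = math.log(weights[c] + eps)
                for j in range(d):
                    p = min(max(probs[c][j], eps), 1.0 - eps)
                    lp += math.log(p if x[j] else 1.0 - p)
                log_post.append(lp)
            top = max(log_post)  # subtract the max for numerical stability
            unnorm = [math.exp(v - top) for v in log_post]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
        # M-step: update weights and condition probabilities.
        for c in range(k):
            n_c = sum(r[c] for r in resp)
            weights[c] = n_c / n
            for j in range(d):
                hits = sum(r[c] * x[j] for r, x in zip(resp, rows))
                probs[c][j] = hits / (n_c + eps)
    labels = [max(range(k), key=lambda c: r[c]) for r in resp]
    return weights, probs, labels


# Hypothetical condition columns and synthetic toy rows (assumptions).
CONDITIONS = ["diabetes", "hypertension", "copd", "asthma"]
toy_rows = (
    [[1, 1, 0, 0]] * 20    # metabolic-like pattern
    + [[0, 0, 1, 1]] * 20  # respiratory-like pattern
    + [[1, 0, 0, 0]] * 3
    + [[0, 0, 0, 1]] * 3
)
weights, probs, labels = fit_bernoulli_mixture(toy_rows, k=2)
for c in range(2):
    profile = {name: round(p, 2) for name, p in zip(CONDITIONS, probs[c])}
    print(f"cluster {c}: weight={weights[c]:.2f} profile={profile}")
```

The per-cluster probabilities in `probs` are what a reviewer would inspect when judging whether a cluster is clinically interpretable, which is the role the abstract assigns to the expert ranking in the second step.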
The study included 4390 participants in NHANES and 12,484 participants in HADCL. Four comorbidity clusters were identified: low comorbidity, metabolic, cardiovascular disease (CVD), and respiratory. In NHANES, compared with participants in the low comorbidity cluster, those in the respiratory cluster had the highest risk of all-cause mortality (adjusted hazard ratio [aHR] 1.62, 95% CI 1.26-2.08; P<.001), followed by those in the CVD cluster (aHR 1.50, 95% CI 1.26-1.80; P<.001). The 3 clusters other than the low comorbidity cluster were associated with higher risks of CVD-related mortality (aHR 1.48-3.05, 95% CI 1.14-4.07; P<.003). The effects of comorbidity clusters on mortality were modified by income-to-poverty ratio (P for interaction=.04), diet quality (P for interaction=.02), and cancer prognosis (P for interaction=.005). In the HADCL (validation) cohort, participants in the respiratory and CVD clusters had a higher risk of all-cause mortality.
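As a rough consistency check on the reported all-cause mortality estimates, the standard error of the log hazard ratio can be back-calculated from a 95% CI and converted to an approximate two-sided P value. The sketch below assumes a Wald-type interval that is symmetric on the log scale with a normal critical value of 1.96; the abstract does not state how the intervals were computed, so the results are approximations only. The helper name `wald_from_ci` is hypothetical.

```python
import math


def wald_from_ci(hr, lower, upper, z_crit=1.96):
    """Approximate SE, z statistic, and two-sided P from an HR and its CI.

    Assumes the interval is exp(log(hr) +/- z_crit * se), which is an
    assumption about the reporting, not something the abstract states.
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area
    return se, z, p


# aHRs and 95% CIs as reported for all-cause mortality in NHANES.
reported = {
    "respiratory": (1.62, 1.26, 2.08),
    "CVD": (1.50, 1.26, 1.80),
}
results = {name: wald_from_ci(*vals) for name, vals in reported.items()}
for name, (se, z, p) in results.items():
    print(f"{name}: se={se:.3f} z={z:.2f} p={p:.2g}")
```

Under these assumptions, both approximate P values fall below .001, in line with the reported P<.001.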
High comorbidity burden clusters showed increased all-cause and CVD-related mortality in patients with cancer. These findings highlight the significance of considering comorbidity burden in cancer care. Machine learning approaches can provide valuable insights into complex multimorbidity profiles. Further research is needed to deepen understanding of the relationships between multimorbidity and cancer-specific outcomes.