Using real-world electronic health record data to predict the development of 12 cancer-related symptoms in the context of multimorbidity.

Author information

Bandyopadhyay Anindita, Albashayreh Alaa, Zeinali Nahid, Fan Weiguo, Gilbertson-White Stephanie

Affiliations

Department of Business Analytics, University of Iowa, Iowa City, IA 52242, United States.

College of Nursing, University of Iowa, Iowa City, IA 52242, United States.

Publication information

JAMIA Open. 2024 Sep 12;7(3):ooae082. doi: 10.1093/jamiaopen/ooae082. eCollection 2024 Oct.

DOI: 10.1093/jamiaopen/ooae082

PMID: 39282082

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11397936/

Abstract

OBJECTIVE

This study uses electronic health record (EHR) data to predict 12 common cancer symptoms, assessing the efficacy of machine learning (ML) models in identifying symptom influencers.

MATERIALS AND METHODS

We analyzed EHR data from 8156 adults diagnosed with cancer who underwent cancer treatment from 2017 to 2020. Structured and unstructured EHR data were sourced from the Enterprise Data Warehouse for Research at the University of Iowa Hospital and Clinics. Several predictive models, including logistic regression, random forest (RF), and XGBoost, were used to forecast symptom development. Model performance was evaluated on the testing set using the F1-score and area under the curve (AUC). The SHapley Additive exPlanations (SHAP) framework was used to interpret these models and to identify predictive risk factors, with fatigue as an exemplar.
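The evaluation step described above (an F1-score and an AUC per symptom on the testing set, then averaged across symptoms) can be illustrated with a minimal sketch. It is not the authors' code. It assumes that AUC means the area under the ROC curve, that predicted probabilities are thresholded at 0.5 for the F1-score, and that the macro average is the unweighted mean over symptoms. The toy labels and scores are invented for illustration only.

```python
from typing import Dict, List, Sequence, Tuple


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN) for a binary label (1 = symptom present)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive case outscores a random negative case.

    Ties count as 0.5. This pairwise form equals the area under the ROC curve.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_average(per_symptom: Dict[str, float]) -> float:
    """Unweighted mean of a metric over symptoms."""
    return sum(per_symptom.values()) / len(per_symptom)


# Hypothetical test-set labels and predicted probabilities for two symptoms.
test_set: Dict[str, Tuple[List[int], List[float]]] = {
    "pain": ([1, 1, 0, 0, 1, 0], [0.9, 0.8, 0.3, 0.1, 0.7, 0.4]),
    "fatigue": ([1, 0, 1, 0, 0, 1], [0.6, 0.7, 0.4, 0.2, 0.3, 0.8]),
}

THRESHOLD = 0.5  # assumed decision threshold, not stated in the abstract

auc_by_symptom = {name: roc_auc(y, s) for name, (y, s) in test_set.items()}
f1_by_symptom = {
    name: f1_score(y, [1 if v >= THRESHOLD else 0 for v in s])
    for name, (y, s) in test_set.items()
}

macro_auc = macro_average(auc_by_symptom)
macro_f1 = macro_average(f1_by_symptom)
print(f"macro AUC = {macro_auc:.3f}, macro F1 = {macro_f1:.3f}")
```

Macro averaging gives every symptom equal weight regardless of how common it is, so a rare symptom with weak predictions pulls the average down as much as a common one would.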

RESULTS

The RF model performed best, with a macro-averaged AUC of 0.755 and an F1-score of 0.729 across the cancer-related symptoms. For instance, the RF model achieved an AUC of 0.954 and an F1-score of 0.914 for pain prediction. Key predictive factors varied by symptom and included clinical history, cancer characteristics, treatment modalities, and patient demographics. For example, allergy (OR = 2.3, 95% CI: 1.8-2.9) and colitis (OR = 1.9, 95% CI: 1.5-2.4) were associated with significantly higher odds of fatigue, where OR denotes the odds ratio.
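The abstract does not say how the odds ratios and confidence intervals were derived. As a reading aid, the sketch below shows one common approach: an odds ratio from a 2x2 table with a Wald 95% confidence interval on the log scale. The function name `odds_ratio_ci` and the counts are hypothetical and are not taken from the study.

```python
import math
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed, symptom present      b: exposed, symptom absent
    c: unexposed, symptom present    d: unexposed, symptom absent
    z = 1.96 corresponds to a 95% interval under a normal approximation.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts: 100 patients with a comorbidity, 100 without.
or_value, ci_low, ci_high = odds_ratio_ci(a=40, b=60, c=20, d=80)
print(f"OR = {or_value:.2f}, 95% CI: {ci_low:.2f}-{ci_high:.2f}")
```

An interval that lies entirely above 1, as with the allergy and colitis estimates reported above, indicates higher odds of the symptom among patients with the condition at the chosen confidence level.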

DISCUSSION

Our research emphasizes the importance of integrating multimorbidity and patient characteristics when modeling cancer symptoms, and it reveals the considerable influence of chronic conditions other than cancer itself.

CONCLUSION

We highlight the potential of ML for predicting cancer symptoms, suggesting a pathway for integrating such models into clinical systems to enhance personalized care and symptom management.
