A Machine Learning Classification Model for Gastrointestinal Health in Cancer Survivors: Roles of Telomere Length and Social Determinants of Health.

Author information

Han Claire J, Ning Xia, Burd Christin E, Tounkara Fode, Kalady Matthew F, Noonan Anne M, Von Ah Diane

Affiliations

Center for Healthy Aging, Self-Management and Complex Care, College of Nursing, The Ohio State University, Columbus, OH 43210, USA.

The James: Cancer Treatment and Research Center, The Ohio State University, Columbus, OH 43210, USA.

Publication information

Int J Environ Res Public Health. 2024 Dec 19;21(12):1694. doi: 10.3390/ijerph21121694.

Abstract

BACKGROUND

Gastrointestinal (GI) distress is prevalent and often persistent among cancer survivors, impacting their quality of life, nutrition, daily function, and mortality. GI health screening is crucial for preventing and managing this distress. However, accurate classification methods for GI health remain unexplored. We aimed to develop machine learning (ML) models to classify GI health status (better vs. worse) by incorporating biological aging and social determinants of health (SDOH) indicators in cancer survivors.

METHODS

We included 645 adult cancer survivors from the 1999-2002 NHANES survey. Using training and test datasets, we employed six ML models to classify GI health conditions (better vs. worse). These models incorporated leukocyte telomere length (TL), SDOH, and demographic/clinical data.
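The abstract does not say how the training and test datasets were formed. As a minimal sketch of one common approach, the following assumes a stratified hold-out split, which keeps the proportion of better vs. worse GI health roughly equal in both partitions. The function name `stratified_split`, the 30% test fraction, the seed, and the class balance are all hypothetical and are not taken from the study; only the total of 645 rows mirrors the cohort size.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Split row indices into train/test lists, preserving class proportions."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Take the same fraction of every class for the test set.
        n_test = int(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Made-up labels for illustration: 1 = worse GI health, 0 = better GI health.
# The 200/445 balance is an assumption, not a figure from the study.
labels = [1] * 200 + [0] * 445
train_idx, test_idx = stratified_split(labels, test_fraction=0.3, seed=42)
```

Models would then be fitted on the rows in `train_idx` only and evaluated on the rows in `test_idx`, so that test performance reflects data the model has not seen.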

RESULTS

Among the ML models, the random forest (RF) performed the best, achieving a high area under the curve (AUC = 0.98) in the training dataset. The gradient boosting machine (GBM) demonstrated excellent classification performance with a high AUC (0.80) in the test dataset. TL, several socio-economic factors, cancer risk behaviors (including lifestyle choices), and inflammatory markers were associated with GI health. The most significant input features for better GI health in our ML models were longer TL and an annual household income above the poverty level, followed by routine physical activity, low white blood cell counts, and food security.
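For readers less familiar with the metric reported above, AUC can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that definition. The function name `roc_auc` and the toy labels and scores are illustrative assumptions and are unrelated to the study's data or software.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 class labels (1 = positive class).
    scores: iterable of model scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # pair ranked correctly
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy example (not study data): three positives, three negatives.
y_true = [1, 1, 1, 0, 0, 0]
perfect_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]  # every positive outranks every negative
noisy_scores = [0.9, 0.4, 0.7, 0.5, 0.2, 0.1]    # one positive falls below one negative

auc_perfect = roc_auc(y_true, perfect_scores)
auc_noisy = roc_auc(y_true, noisy_scores)
```

Under this reading, an AUC of 0.98 on training data alongside 0.80 on test data means the model ranks pairs almost perfectly on cases it was fitted to and less reliably on held-out cases, which is why test-set AUC is the more informative figure.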

CONCLUSIONS

Our findings provide valuable insights into classifying and identifying risk factors related to GI health, including biological aging and SDOH indicators. To enhance model predictability, further longitudinal studies and external clinical validations are necessary.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0731/11675289/4153f9585d97/ijerph-21-01694-g001.jpg