Annu Int Conf IEEE Eng Med Biol Soc. 2021 Nov;2021:1848-1851. doi: 10.1109/EMBC46164.2021.9630370.
Cancer is an aggressive disease that imposes a tremendous socio-economic burden on the international community. Early detection is important for improving survival rates; however, very few studies have investigated whether the people at highest risk of developing the disease can be identified years before the traditional symptoms first occur. In this paper, a dataset from a longitudinal study of 2291 70-year-olds in Sweden is analyzed to investigate the possibility of predicting 2-7 year cancer-specific mortality. A tailored ensemble model was developed to handle this highly imbalanced dataset. Performance with different feature subsets was investigated to evaluate the impact that heterogeneous data sources have on the overall model. While a full-feature model achieves an Area Under the ROC Curve (AUC-ROC) of 0.882, a feature subset that includes only demographics, self-reported health and lifestyle data, and wearable data collected in free-living environments performs similarly (AUC-ROC: 0.857). This analysis confirms the importance of wearable technology for providing unbiased health markers and suggests its possible use in the accurate prediction of 2-7 year cancer-related mortality in older adults.
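The AUC-ROC figures quoted above can be read as the probability that a randomly chosen person who died of cancer receives a higher risk score than a randomly chosen person who did not. Because it compares ranks across the two classes, it does not depend on a decision threshold or on how rare the positive class is, which makes it a common choice for imbalanced data. The following is a minimal sketch of that pairwise definition, not the authors' code: the function name `auc_roc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def auc_roc(labels, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as half a win."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (NOT from the study): 2 positives among 10 people,
# imbalanced on purpose. Scores stand in for a model's predicted risk.
toy_labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
toy_scores = [0.05, 0.10, 0.20, 0.15, 0.30, 0.60, 0.25, 0.12, 0.70, 0.40]

toy_auc = auc_roc(toy_labels, toy_scores)
print(toy_auc)  # 15 of the 16 positive/negative pairs are ranked correctly
```

The double loop is quadratic in the number of samples; it is written this way only to mirror the definition. A rank-based implementation gives the same value more efficiently on larger datasets.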