Tran Tao Thi, Lee Jeonghee, Kim Junetae, Kim Sun-Young, Cho Hyunsoon, Kim Jeongseon
Department of Cancer Control and Population Health, Graduate School of Cancer Science and Policy, Goyang-si, Gyeonggi-do, Republic of Korea.
Faculty of Public Health, University of Medicine and Pharmacy, Hue University, Hue city, Vietnam.
BMC Public Health. 2024 Dec 20;24(1):3549. doi: 10.1186/s12889-024-20852-8.
Given the rapid increase in the prevalence of prostate cancer (PCa), identifying its risk factors and developing suitable risk prediction models has important implications for public health. We used a machine learning (ML) approach to identify participants at high risk of PCa and, specifically, investigated whether participants with metabolic syndrome (MetS) exhibited an elevated PCa risk.
A prospective cohort study was performed with 41,837 participants in South Korea. We predicted PCa based on MetS, its components, and sociodemographic factors using Cox proportional hazards and five ML models. Integrated Brier score (IBS) and C-index were used to assess model performance.
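As a rough illustration of the discrimination metric named above, the following is a minimal sketch of Harrell's C-index in plain Python. It is not the authors' code: the function name `concordance_index`, the choice to skip pairs with tied follow-up times, and the toy data are all assumptions made for this example.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable pairs ranked correctly.

    times       -- follow-up time for each subject
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- model output, higher meaning higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification (assumption): ignore tied times
        # a is the subject with the shorter follow-up time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data invented for illustration only, not taken from the study.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.5, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported C-index values should be read.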
A total of 210 incident PCa cases were identified. We found good calibration and discrimination for all models (C-index ≥ 0.800 and IBS = 0.01). Importantly, performance increased after excluding MetS and its components from the models; the highest C-index was 0.862, for the survival support vector machine. In contrast, first-degree family history of PCa, alcohol consumption, age, and income were valuable for PCa prediction.
ML models are an effective approach for developing prediction models in survival analysis. Furthermore, MetS and its components do not seem to influence PCa susceptibility, in contrast to first-degree family history of PCa, age, alcohol consumption, and income.