School of Public Health, University of São Paulo, São Paulo, Brazil.
Carlos Chagas Institute, Oswaldo Cruz Foundation, Curitiba, Brazil.
Age Ageing. 2021 Sep 11;50(5):1692-1698. doi: 10.1093/ageing/afab067.
Population ageing has been accelerating at a remarkable rate in developing countries. In this scenario, preventive strategies could help to reduce the burden of rising demand for healthcare services. Machine learning algorithms have been increasingly applied to identify priority candidates for preventive actions, presenting better predictive performance than traditional parsimonious models.
Data were collected from the Health, Well-Being and Aging (SABE) Study, a representative sample of older residents of São Paulo, Brazil. Machine learning algorithms were applied to predict death within 5 years from diseases of the respiratory system (DRS), diseases of the circulatory system (DCS), neoplasms and other specific causes, using socioeconomic, demographic and health features. The algorithms were trained on a random sample of 70% of subjects and then tested on the remaining 30%, which were unseen during training.
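The 70/30 random split described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the study's actual pipeline; the function name `train_test_split_ids`, the seed, and the cohort of 1,000 identifiers are assumptions made for the example.

```python
import random


def train_test_split_ids(subject_ids, train_fraction=0.7, seed=42):
    """Randomly assign subjects to a training set and a held-out test set."""
    ids = list(subject_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]


# Hypothetical cohort of 1,000 subject identifiers (not the SABE sample size)
train_ids, test_ids = train_test_split_ids(range(1000))
print(len(train_ids), len(test_ids))
```

Splitting by subject before any model fitting keeps the test subjects unseen, which is what makes the reported test performance an estimate of performance on new individuals.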
The outcome with the highest predictive performance was death from DRS (AUC-ROC = 0.89), followed by other specific causes (AUC-ROC = 0.87), DCS (AUC-ROC = 0.67) and neoplasms (AUC-ROC = 0.52). The 25% of individuals with the highest predicted risk of mortality from DRS included 100% of the actual cases. The machine learning algorithms with the highest predictive performance were light gradient boosted machine and extreme gradient boosting.
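The two kinds of metric reported above, AUC-ROC and the share of actual cases found among the highest-risk quarter of individuals, can be illustrated with a short standard-library sketch. The function names and the toy labels and scores below are hypothetical and are not study data. AUC-ROC is computed here as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case, with ties counted as one half.

```python
def auc_roc(y_true, y_score):
    """AUC-ROC via pairwise comparison of cases against non-cases."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    # A case "wins" when it is scored above a non-case; ties count as 0.5
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def capture_rate_top_fraction(y_true, y_score, fraction=0.25):
    """Share of actual cases found among the top `fraction` of predicted risks."""
    total = sum(y_true)
    if total == 0:
        raise ValueError("no actual cases in y_true")
    ranked = sorted(zip(y_score, y_true), key=lambda t: t[0], reverse=True)
    k = max(1, int(len(ranked) * fraction))  # size of the high-risk group
    return sum(y for _, y in ranked[:k]) / total


# Toy example: 8 individuals, 2 of whom died from the cause of interest
y_true = [1, 0, 1, 0, 0, 0, 0, 0]
y_score = [0.9, 0.2, 0.6, 0.7, 0.3, 0.1, 0.4, 0.05]
print(auc_roc(y_true, y_score))
print(capture_rate_top_fraction(y_true, y_score))
```

A capture rate of 1.0 in the top 25% corresponds to the DRS result reported above: every actual case falls within the highest-risk quarter, so targeting only that quarter would miss none of them.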
The algorithms had high predictive performance for DRS but lower performance for DCS and neoplasms. Mortality prediction with machine learning can improve clinical decisions, especially regarding targeted preventive measures for older individuals.