FRELSA: A dataset for frailty in elderly people originated from ELSA and evaluated through machine learning models.

Affiliation

Universidad Politécnica de Madrid, Av. Complutense, 30, 28040, Madrid, Spain.

Publication information

Int J Med Inform. 2024 Dec;192:105603. doi: 10.1016/j.ijmedinf.2024.105603. Epub 2024 Aug 19.

Abstract

BACKGROUND

Frailty is an age-related syndrome characterized by exhaustion and loss of strength, and it is associated with multi-morbidity. Early detection and prediction of the onset of frailty could help older people age better and spare them invasive and expensive treatments. Machine learning techniques show promising results as the basis for a medical support tool for this task.

METHODS

This study aims to create a dataset for machine learning-based frailty studies, using Fried's Frailty Phenotype definition. Starting from a longitudinal study on aging in the UK population, we defined a frailty label for each subject. We evaluated the definition by training seven different models to detect frailty from data collected at the same time as the data used for the definition. We then integrated data from two years earlier to obtain prediction models with a 24-month horizon. Feature selection was performed using the MultiSURF algorithm, which ranks all features in order of relevance to the detection or prediction task.
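As a rough illustration of the labelling and the 24-month setup described above, the following sketch assigns a three-class label from the five Fried criteria using the commonly cited cut-offs (0 criteria: robust, 1 or 2: pre-frail, 3 or more: frail), then pairs features from an earlier wave with labels from a later one. The criterion names, the dictionary layout, and the helper `pair_for_prediction` are hypothetical; the abstract does not state which study variables or thresholds the authors used.

```python
from typing import Dict, List, Tuple

# Hypothetical names for the five Fried phenotype criteria.
CRITERIA = ("weight_loss", "exhaustion", "weakness", "slowness", "low_activity")


def frailty_label(criteria: Dict[str, bool]) -> str:
    """Return a three-class label from the number of criteria met."""
    score = sum(bool(criteria[name]) for name in CRITERIA)
    if score >= 3:
        return "frail"
    if score >= 1:
        return "pre-frail"
    return "robust"


def pair_for_prediction(
    earlier_features: Dict[str, List[float]],
    later_labels: Dict[str, str],
) -> List[Tuple[List[float], str]]:
    """Pair each subject's earlier-wave features with the later-wave label.

    Subjects missing from either wave are dropped (an assumption of this sketch).
    """
    return [
        (earlier_features[subject_id], later_labels[subject_id])
        for subject_id in sorted(earlier_features)
        if subject_id in later_labels
    ]
```

For detection, features and labels would come from the same wave; for prediction, `pair_for_prediction` shifts the features back by one wave so that the model only sees information available two years before the label.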

RESULTS

We present a new frailty dataset of 5303 subjects and more than 6500 available features. It is publicly available, provided one has access to the original English Longitudinal Study of Ageing dataset. The dataset is balanced after grouping frailty with pre-frailty, and it is suitable for multiclass or binary classification and prediction problems. The seven tested architectures performed similarly, forming a solid baseline that can be improved with future work. Linear regression achieved the best F-score and AUROC in detection and prediction tasks.
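The binary grouping and the two reported metrics can be sketched in plain Python. This is a minimal illustration, not the authors' evaluation code: the label strings, the 0.5 decision threshold, and the example scores are assumptions made for the sketch. AUROC is computed from its pairwise definition (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half).

```python
from typing import List, Sequence


def to_binary(label: str) -> int:
    """Group frail with pre-frail (1) against robust (0)."""
    return 0 if label == "robust" else 1


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative values only; these are not results from the paper.
labels = ["frail", "pre-frail", "robust", "robust", "pre-frail", "robust"]
scores = [0.9, 0.4, 0.3, 0.6, 0.7, 0.2]
y_true: List[int] = [to_binary(label) for label in labels]
y_pred: List[int] = [1 if s >= 0.5 else 0 for s in scores]
print(f1_score(y_true, y_pred), auroc(y_true, scores))
```

The pairwise AUROC is quadratic in the number of subjects, which is acceptable for a sketch; a rank-based implementation would be preferable on a dataset of several thousand subjects.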

CONCLUSIONS

Creating new frailty-annotated datasets of this size is necessary to develop and improve frailty prediction techniques. We have shown that our dataset can be used to study and test machine learning models that detect and predict frailty. Future work should improve the models' architecture and performance, consider explainability, and possibly enrich the dataset with earlier waves.
