Enhanced Multi-Model Machine Learning-Based Dementia Detection Using a Data Enrichment Framework: Leveraging the Blessing of Dimensionality.

Author information

Yongcharoenchaiyasit Khomkrit, Arwatchananukul Sujitra, Hristov Georgi, Temdee Punnarumol

Affiliations

Computer and Communication Engineering for Capacity Building Research Center, Chiang Rai 57100, Thailand.

School of Applied Digital Technology, Mae Fah Luang University, Chiang Rai 57100, Thailand.

Publication information

Bioengineering (Basel). 2025 May 30;12(6):592. doi: 10.3390/bioengineering12060592.

Abstract

The early diagnosis of dementia, a progressive condition impairing memory, cognition, and functional ability in older adults, is essential for timely intervention and improved patient outcomes. This study proposes a novel multiclass classification that differentiates dementia from other comorbid conditions, specifically cardiovascular diseases, including heart failure and aortic valve disorder, by leveraging the "blessing of dimensionality" to enhance predictive performance while ensuring feature accessibility. Using a dataset of 26,474 electronic health records from two hospitals in Chiang Rai, Thailand, the proposed framework introduced clinically informed feature augmentation to enhance model generalizability. Furthermore, the borderline synthetic minority oversampling technique was employed to address class imbalance, enhancing the model's performance for minority classes. This study systematically evaluated a suite of machine learning models, including extreme gradient boosting, gradient boosting, random forest, support vector machine, decision trees, k-nearest neighbors, extra trees, and TabNet, across both the original and enriched datasets, with the latter integrating augmented features and synthetic data. Predictive performance was assessed using accuracy, precision, recall, F1 score, area under the receiver operating characteristic curve, and area under the precision-recall curve. The results revealed that all the models exhibited consistent performance improvements with the enriched dataset, affirming the value of dimensionality when guided by domain expertise.
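The abstract names the borderline synthetic minority oversampling technique but does not describe the implementation used. As a rough illustration of the general idea only, the sketch below flags minority samples whose nearest neighbours are mostly, but not entirely, from other classes, and interpolates new points between those "danger" samples and nearby minority samples. The function name `borderline_smote`, the toy data, and the parameters `k`, `n_new`, and `seed` are all hypothetical and are not taken from the study; using a single `k` for both neighbour searches is a simplification.

```python
import math
import random


def borderline_smote(X, y, minority, k=3, n_new=10, seed=0):
    """Simplified Borderline-SMOTE (variant 1), for illustration only.

    X: list of feature vectors; y: list of class labels.
    Returns up to n_new synthetic samples for the `minority` class.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority]

    def nearest(i, candidates):
        # The k candidates closest to sample i, excluding i itself.
        others = [j for j in candidates if j != i]
        return sorted(others, key=lambda j: math.dist(X[i], X[j]))[:k]

    # A minority sample is "in danger" when at least half, but not all,
    # of its k nearest neighbours (over the whole dataset) belong to
    # other classes. All-other-class neighbourhoods are treated as noise.
    danger = []
    for i in minority_idx:
        neigh = nearest(i, range(len(X)))
        n_other = sum(1 for j in neigh if y[j] != minority)
        if len(neigh) / 2 <= n_other < len(neigh):
            danger.append(i)

    if not danger or len(minority_idx) < 2:
        return []

    synthetic = []
    for _ in range(n_new):
        i = rng.choice(danger)
        j = rng.choice(nearest(i, minority_idx))  # a nearby minority sample
        gap = rng.random()                        # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(X[i], X[j])])
    return synthetic


# Toy 2-D data (made up): class 0 is the larger class, class 1 the minority.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0],   # class 0
     [1.2, 1.2], [2.0, 2.0], [2.5, 2.0], [2.0, 2.5]]   # class 1
y = [0, 0, 0, 0, 1, 1, 1, 1]

synthetic = borderline_smote(X, y, minority=1, k=3, n_new=10)
print(len(synthetic), "synthetic minority samples generated")
```

In this toy set only the point at (1.2, 1.2) sits near the class boundary, so every synthetic sample lies on a segment between it and one of the other minority points. For real work, a maintained implementation such as the `BorderlineSMOTE` class in the imbalanced-learn package would be the more sensible choice.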

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e598/12189127/9fc4caa8d683/bioengineering-12-00592-g001.jpg
