Multiple feature selection based on an optimization strategy for causal analysis of health data.

Authors

Cong Ruichen, Deng Ou, Nishimura Shoji, Ogihara Atsushi, Jin Qun

Affiliations

Graduate School of Human Sciences, Waseda University, 2-579-15 Mikajima, Tokorozawa, Saitama 359-1192, Japan.

Faculty of Human Sciences, Waseda University, 2-579-15 Mikajima, Tokorozawa, Saitama 359-1192, Japan.

Publication information

Health Inf Sci Syst. 2024 Nov 12;12(1):52. doi: 10.1007/s13755-024-00312-8. eCollection 2024 Dec.

Abstract

PURPOSE

Recent advances in information technology and wearable devices have revolutionized healthcare through health data analysis. Identifying significant relationships in complex health data enhances healthcare and public health strategies. In health analytics, causal graphs are important for investigating the relationships among health features. However, causal graphs face challenges owing to the large number of features, complexity, and computational demands. Feature selection methods help address these challenges. In this paper, we present a framework for multiple feature selection based on an optimization strategy for causal analysis of health data.

METHODS

We select multiple health features based on an optimization strategy. First, we define a Weighted Total Score (WTS) index to assess feature importance after combining different feature selection methods. To find an optimal set of weights, one per method, we design a multiple feature selection algorithm integrated with a greedy algorithm. The features are then ranked by their WTS, which enables selection of the most important ones. Next, causal graphs are constructed from the selected features, and the statistical significance of the paths is assessed. Finally, evaluation experiments are conducted on an experimental dataset collected for this study and on an open diabetes dataset.
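
The abstract does not give the WTS formula or the details of the greedy search, so the following is only a minimal sketch under stated assumptions: WTS is taken to be a weighted sum of min-max-normalized importance scores from each feature selection method, and the greedy step is a simple hill climb that shifts weight between methods while a user-supplied objective improves. The function names, the `step` parameter, and the toy `objective` are hypothetical and are not taken from the paper, where the objective would presumably be a model performance measure.

```python
import numpy as np

def min_max(scores):
    """Scale one method's raw importance scores to [0, 1]."""
    scores = np.asarray(scores, dtype=float)
    span = scores.max() - scores.min()
    return np.zeros_like(scores) if span == 0 else (scores - scores.min()) / span

def weighted_total_score(score_matrix, weights):
    """Assumed WTS: weighted sum over methods (rows = methods, columns = features)."""
    return np.asarray(weights, dtype=float) @ np.asarray(score_matrix, dtype=float)

def greedy_weights(score_matrix, objective, step=0.1, max_rounds=50):
    """Greedy hill climb (an assumption, not the paper's algorithm): start from
    equal weights and move `step` of weight from one method to another whenever
    that raises objective(WTS). Weights always sum to 1."""
    n_methods = score_matrix.shape[0]
    weights = np.full(n_methods, 1.0 / n_methods)
    best = objective(weighted_total_score(score_matrix, weights))
    for _ in range(max_rounds):
        improved = False
        for src in range(n_methods):
            for dst in range(n_methods):
                # Skip no-op moves and moves that would make a weight negative.
                if src == dst or weights[src] < step - 1e-12:
                    continue
                trial = weights.copy()
                trial[src] -= step
                trial[dst] += step
                value = objective(weighted_total_score(score_matrix, trial))
                if value > best + 1e-12:
                    weights, best, improved = trial, value, True
        if not improved:
            break
    return weights, best

# Toy data: two hypothetical methods scoring three features.
score_matrix = np.vstack([
    min_max([10.0, 0.0, 5.0]),   # method A
    min_max([0.0, 8.0, 4.0]),    # method B
])
# Hypothetical objective: closeness of WTS to a made-up reference relevance.
reference = np.array([1.0, 0.0, 0.5])
objective = lambda wts: -float(np.sum((wts - reference) ** 2))

weights, best = greedy_weights(score_matrix, objective)
wts = weighted_total_score(score_matrix, weights)
top_features = np.argsort(-wts)[:2]   # indices of the two highest-WTS features
print(weights, wts, top_features)
```

In this toy setup method A matches the reference exactly, so the search moves all weight onto it. With a real dataset the objective would be replaced by a cross-validated performance score computed on the top-ranked features.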

RESULTS

The results demonstrate that our approach outperforms baseline models by reducing the number of features while improving model performance. Moreover, the statistical significance of the relationships between features uncovered through causal graphs is validated for both datasets.

CONCLUSION

By using the proposed framework for multiple feature selection based on an optimization strategy for causal analysis, the number of features is reduced and the causal relationships are uncovered and validated.
