Barmpas Petros, Tasoulis Sotiris, Vrahatis Aristidis G, Georgakopoulos Spiros V, Anagnostou Panagiotis, Prina Matthew, Ayuso-Mateos José Luis, Bickenbach Jerome, Bayes Ivet, Bobak Martin, Caballero Francisco Félix, Chatterji Somnath, Egea-Cortés Laia, García-Esquinas Esther, Leonardi Matilde, Koskinen Seppo, Koupil Ilona, Pająk Andrzej, Prince Martin, Sanderson Warren, Scherbov Sergei, Tamosiunas Abdonas, Galas Aleksander, Haro Josep Maria, Sanchez-Niubo Albert, Plagianakos Vassilis P, Panagiotakos Demosthenes
Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, Greece.
Department of Mathematics, University of Thessaly, Lamia, Greece.
Health Inf Sci Syst. 2022 Apr 18;10(1):6. doi: 10.1007/s13755-022-00171-1. eCollection 2022 Dec.
The ATHLOS cohort comprises several harmonized datasets from international groups studying health and aging. From it, the Healthy Aging index has been constructed using a selection of variables from 16 individual studies. In this paper, we consider additional variables found in ATHLOS and investigate their use for predicting the Healthy Aging index. Motivated by the volume and diversity of the dataset, we focus on data clustering, where unsupervised learning is used to enhance predictive power, and we show the predictive utility of exploiting hidden data structures. In addition, we demonstrate that the computational bottlenecks this imposes can be overcome by using appropriate hierarchical clustering within a clustering-for-ensemble-classification scheme, while retaining the prediction benefits. We propose a complete methodology and evaluate it against baseline methods and the original concept. The results are very encouraging, suggesting further developments in this direction along with applications to tasks with similar characteristics. A straightforward open-source implementation in R is also provided (https://github.com/Petros-Barmpas/HCEP).
The online version contains supplementary material available at 10.1007/s13755-022-00171-1.
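The general idea described in the abstract (cluster the data hierarchically without using the labels, then let each cluster carry its own local predictor) can be illustrated with a minimal sketch. This is not the authors' HCEP implementation, which is written in R and available at the repository above. Everything here is an assumption made for illustration: the function names `split_feature`, `build_tree`, and `predict`, the toy data, the choice to bisect on the highest-variance feature at its mean, and the use of a majority-label vote as the local model.

```python
from collections import Counter
from statistics import mean, pvariance


def split_feature(rows):
    """Return the index of the feature with the largest variance (labels are not used)."""
    n_features = len(rows[0])
    return max(range(n_features), key=lambda j: pvariance([r[j] for r in rows]))


def build_tree(X, y, min_size=2, depth=0, max_depth=2):
    """Divisive clustering: bisect the samples recursively using X only.

    Each leaf is one cluster and stores a local predictor, here simply the
    majority label of the training samples that fell into it (an assumption).
    """
    leaf = {"leaf": True, "prediction": Counter(y).most_common(1)[0][0]}
    if depth >= max_depth or len(X) < 2 * min_size:
        return leaf
    j = split_feature(X)
    threshold = mean(r[j] for r in X)
    left = [i for i, r in enumerate(X) if r[j] <= threshold]
    right = [i for i, r in enumerate(X) if r[j] > threshold]
    if len(left) < min_size or len(right) < min_size:
        return leaf  # refuse splits that would create tiny clusters
    return {
        "leaf": False,
        "feature": j,
        "threshold": threshold,
        "left": build_tree([X[i] for i in left], [y[i] for i in left],
                           min_size, depth + 1, max_depth),
        "right": build_tree([X[i] for i in right], [y[i] for i in right],
                            min_size, depth + 1, max_depth),
    }


def predict(tree, x):
    """Route a new sample down the hierarchy and use its cluster's local predictor."""
    while not tree["leaf"]:
        side = "left" if x[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[side]
    return tree["prediction"]


# Toy data (hypothetical): two well-separated groups along the first feature.
X = [[0.0, 1.0], [0.2, 0.9], [0.1, 1.1], [0.3, 1.0],
     [5.0, 1.0], [5.2, 0.8], [5.1, 1.2], [4.9, 1.0]]
y = ["low", "low", "low", "low", "high", "high", "high", "high"]

tree = build_tree(X, y)
print(predict(tree, [0.1, 1.0]))
print(predict(tree, [5.05, 1.0]))
```

Because each split looks only at the samples in the current node, the cost of building the hierarchy stays modest as the data grows, which is one plausible reading of how a hierarchical scheme eases the computational bottleneck mentioned in the abstract. In a realistic setting the majority vote would be replaced by a classifier trained on each cluster.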