kmlShape: An Efficient Method to Cluster Longitudinal Data (Time-Series) According to Their Shapes.

Author information

Genolini Christophe, Ecochard René, Benghezal Mamoun, Driss Tarak, Andrieu Sandrine, Subtil Fabien

Affiliations

Inserm UMR 1027, University of Toulouse III, Toulouse, France.

CeRSM (EA 2931), UFR STAPS, University Paris Ouest-Nanterre-La Défense, Nanterre, France.

Publication information

PLoS One. 2016 Jun 3;11(6):e0150738. doi: 10.1371/journal.pone.0150738. eCollection 2016.

Abstract

BACKGROUND

Longitudinal data are data in which each variable is measured repeatedly over time. One way to analyze such data is to cluster them. Most clustering methods group together individuals whose trajectories are close at given time points. These methods thus group trajectories that are locally close, but not necessarily those that have similar shapes. However, in several circumstances, how a phenomenon progresses may matter more than the moment at which it occurs. One would thus like a partitioning in which each group gathers individuals whose trajectories have similar shapes, whatever the time lag between them.
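The contrast between "locally close" and "similar shape" can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not the article's implementation: the abstract does not name the shape-based distance, so the discrete Fréchet distance is used here as one possible shape-aware choice, and the toy trajectories and the helper names (`pointwise_distance`, `discrete_frechet`, `to_points`) are hypothetical.

```python
import math

def pointwise_distance(a, b):
    """Euclidean distance between two trajectories measured at the same time points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def discrete_frechet(p, q):
    """Discrete Frechet distance between two polylines of (time, value) points.

    Illustrative stand-in for a shape-aware distance (an assumption, not
    necessarily the distance used in the article).
    """
    n, m = len(p), len(q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                # Cheapest way to reach (i, j), then pay the current pairing cost.
                best_prev = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]

def to_points(values):
    """Attach time indices 0, 1, 2, ... to a list of measurements."""
    return [(float(t), float(v)) for t, v in enumerate(values)]

# Hypothetical toy trajectories: the same peak, early or late, and no peak at all.
early = [0, 0, 10, 0, 0, 0, 0, 0, 0, 0]
late = [0, 0, 0, 0, 0, 0, 10, 0, 0, 0]
flat = [0] * 10

# Pointwise: the early-peak trajectory looks closer to the flat one than to the late-peak one.
print(pointwise_distance(early, late), pointwise_distance(early, flat))

# Shape-aware: the two peaked trajectories are closer to each other than to the flat one.
print(discrete_frechet(to_points(early), to_points(late)),
      discrete_frechet(to_points(early), to_points(flat)))
```

In this toy case the pointwise distance would place the early-peak trajectory with the flat one, while the shape-aware distance pairs the two peaked trajectories despite the time lag. Note that a distance computed in the (time, value) plane depends on the relative scaling of the two axes, which is a modeling choice in this sketch.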

METHOD

In this article, we present a longitudinal data partitioning algorithm based on the shapes of the trajectories rather than on classical distances. Because this algorithm is time-consuming, we also propose two data simplification procedures that make it applicable to high-dimensional datasets.

RESULTS

In an application to Alzheimer's disease, this algorithm revealed a "rapid decline" patient group that was not found by the classical methods. In another application to the menstrual cycle, the algorithm showed, contrary to the current literature, that luteinizing hormone presents two peaks in a substantial proportion of women (22%).

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fba2/4892497/4e64c5716393/pone.0150738.g001.jpg
