Suppr超能文献

有时:用于时间序列聚类的自组织映射及其在重病对话中的应用

Somtimes: self organizing maps for time series clustering and its application to serious illness conversations.

作者信息

Javed Ali, Rizzo Donna M, Lee Byung Suk, Gramling Robert

机构信息

Department of Medicine, Stanford University, 300 Pasteur Dr, Stanford, CA 94305 USA.

Department of Computer Science, University of Vermont, Burlington, VT USA.

出版信息

Data Min Knowl Discov. 2024;38(3):813-839. doi: 10.1007/s10618-023-00979-9. Epub 2023 Oct 20.

Abstract

UNLABELLED

There is demand for scalable algorithms capable of clustering and analyzing large time series data. The Kohonen self-organizing map (SOM) is an unsupervised artificial neural network for clustering, visualizing, and reducing the dimensionality of complex data. Like all clustering methods, it requires a measure of similarity between input data (in this work time series). Dynamic time warping (DTW) is one such measure, and a top performer that accommodates distortions when aligning time series. Despite its popularity in clustering, DTW is limited in practice because the runtime complexity is quadratic with the length of the time series. To address this, we present a new a self-organizing map for clustering TIME Series, called SOMTimeS, which uses DTW as the distance measure. The method has similar accuracy compared with other DTW-based clustering algorithms, yet scales better and runs faster. The computational performance stems from the pruning of unnecessary DTW computations during the SOM's training phase. For comparison, we implement a similar pruning strategy for K-means, and call the latter K-TimeS. SOMTimeS and K-TimeS pruned 43% and 50% of the total DTW computations, respectively. Pruning effectiveness, accuracy, execution time and scalability are evaluated using 112 benchmark time series datasets from the UC Riverside classification archive, and show that for similar accuracy, a 1.8 speed-up on average for SOMTimeS and K-TimeS, respectively with that rates vary between 1 and 18 depending on the dataset. We also apply SOMTimeS to a healthcare study of patient-clinician serious illness conversations to demonstrate the algorithm's utility with complex, temporally sequenced natural language.

SUPPLEMENTARY INFORMATION

The online version contains supplementary material available at 10.1007/s10618-023-00979-9.

摘要

未标注

对于能够对大型时间序列数据进行聚类和分析的可扩展算法存在需求。科霍宁自组织映射(SOM)是一种用于聚类、可视化和降低复杂数据维度的无监督人工神经网络。与所有聚类方法一样,它需要一种输入数据(在本研究中为时间序列)之间的相似性度量。动态时间规整(DTW)就是这样一种度量,并且是在对齐时间序列时能够适应扭曲的顶级方法。尽管DTW在聚类中很受欢迎,但在实践中它受到限制,因为运行时复杂度与时间序列的长度呈二次关系。为了解决这个问题,我们提出了一种用于聚类时间序列的新自组织映射,称为SOMTimeS,它使用DTW作为距离度量。与其他基于DTW的聚类算法相比,该方法具有相似的准确性,但扩展性更好且运行速度更快。计算性能源于在SOM训练阶段对不必要的DTW计算进行修剪。为了进行比较,我们为K均值实现了类似的修剪策略,并将后者称为K-TimeS。SOMTimeS和K-TimeS分别修剪了总DTW计算的43%和50%。使用来自加州大学河滨分校分类存档的112个基准时间序列数据集评估了修剪效果、准确性、执行时间和可扩展性,结果表明,在相似准确性的情况下,SOMTimeS和K-TimeS平均分别加速了1.8倍,加速率在1到18之间,具体取决于数据集。我们还将SOMTimeS应用于一项关于患者与临床医生严重疾病对话的医疗保健研究,以证明该算法在复杂的、按时间顺序排列的自然语言中的实用性。

补充信息

在线版本包含可在10.1007/s10618-023-00979-9获取的补充材料。

文献检索

告别复杂PubMed语法,用中文像聊天一样搜索,搜遍4000万医学文献。AI智能推荐,让科研检索更轻松。

立即免费搜索

文件翻译

保留排版,准确专业,支持PDF/Word/PPT等文件格式,支持 12+语言互译。

免费翻译文档

深度研究

AI帮你快速写综述,25分钟生成高质量综述,智能提取关键信息,辅助科研写作。

立即免费体验