SOMTimeS: self organizing maps for time series clustering and its application to serious illness conversations.

Authors

Javed Ali, Rizzo Donna M, Lee Byung Suk, Gramling Robert

Affiliations

Department of Medicine, Stanford University, 300 Pasteur Dr, Stanford, CA 94305 USA.

Department of Computer Science, University of Vermont, Burlington, VT USA.

Publication information

Data Min Knowl Discov. 2024;38(3):813-839. doi: 10.1007/s10618-023-00979-9. Epub 2023 Oct 20.

DOI:10.1007/s10618-023-00979-9

PMID:38711534

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11069464/

Abstract

There is demand for scalable algorithms capable of clustering and analyzing large time series data. The Kohonen self-organizing map (SOM) is an unsupervised artificial neural network for clustering, visualizing, and reducing the dimensionality of complex data. Like all clustering methods, it requires a measure of similarity between input data (in this work, time series). Dynamic time warping (DTW) is one such measure, and a top performer that accommodates distortions when aligning time series. Despite its popularity in clustering, DTW is limited in practice because its runtime complexity is quadratic in the length of the time series. To address this, we present a new self-organizing map for clustering TIME Series, called SOMTimeS, which uses DTW as the distance measure. The method has accuracy similar to that of other DTW-based clustering algorithms, yet scales better and runs faster. The computational performance stems from the pruning of unnecessary DTW computations during the SOM's training phase. For comparison, we implement a similar pruning strategy for K-means and call it K-TimeS. SOMTimeS and K-TimeS pruned 43% and 50% of the total DTW computations, respectively. Pruning effectiveness, accuracy, execution time, and scalability are evaluated using 112 benchmark time series datasets from the UC Riverside classification archive. The results show that, for similar accuracy, SOMTimeS and K-TimeS achieve a 1.8× speed-up on average, with rates varying between 1× and 18× depending on the dataset. We also apply SOMTimeS to a healthcare study of patient-clinician serious illness conversations to demonstrate the algorithm's utility with complex, temporally sequenced natural language.
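
The pruning idea in the abstract can be illustrated with a small sketch. The abstract does not say which bounds SOMTimeS uses, so this example assumes a well-known one, the LB_Keogh lower bound, together with a Sakoe-Chiba band for DTW. The function names `dtw`, `lb_keogh`, and `find_bmu`, the `window` parameter, and the toy data are all hypothetical and are not taken from the authors' implementation. The sketch only shows how a cheap lower bound can skip full DTW computations when searching for a SOM's best-matching unit.

```python
import math
import random


def dtw(a, b, window):
    """DTW distance under a Sakoe-Chiba band of half-width `window`.

    The dynamic program is quadratic in the series length when the band
    spans the whole series, which is the cost the abstract refers to.
    """
    n, m = len(a), len(b)
    w = max(window, abs(n - m))  # band must be wide enough to reach the corner
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        cur = [math.inf] * (m + 1)
        for j in range(max(1, i - w), min(m, i + w) + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return math.sqrt(prev[m])


def lb_keogh(query, candidate, window):
    """Cheap lower bound on dtw(query, candidate, window); equal lengths only."""
    total = 0.0
    n = len(candidate)
    for i, q in enumerate(query):
        band = candidate[max(0, i - window):min(n, i + window + 1)]
        upper, lower = max(band), min(band)
        if q > upper:
            total += (q - upper) ** 2
        elif q < lower:
            total += (lower - q) ** 2
    return math.sqrt(total)


def find_bmu(series, weights, window):
    """Return (index of the closest unit, number of full DTW calls made)."""
    best_idx, best_dist, dtw_calls = -1, math.inf, 0
    for idx, unit in enumerate(weights):
        if lb_keogh(series, unit, window) >= best_dist:
            continue  # the bound proves this unit cannot beat the current best
        dtw_calls += 1
        dist = dtw(series, unit, window)
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx, dtw_calls


# Toy data (assumed for illustration): four SOM units with different mean levels.
rng = random.Random(0)
weights = [[rng.gauss(mu, 1.0) for _ in range(30)] for mu in (0, 5, 10, 15)]
series = [rng.gauss(0, 1.0) for _ in range(30)]
bmu, calls = find_bmu(series, weights, window=3)
print(f"best-matching unit: {bmu}, DTW calls: {calls} of {len(weights)}")
```

Because a skipped unit's true DTW distance is at least its lower bound, which is itself at least the current best distance, the pruned search returns the same unit as an exhaustive search while making fewer full DTW calls.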

SUPPLEMENTARY INFORMATION

The online version contains supplementary material available at 10.1007/s10618-023-00979-9.
