Venkatasubramaniam A, Evers L, Thakuriah P, Ampountolas K
The Alan Turing Institute, The British Library, London, UK.
School of Mathematics and Statistics, University of Glasgow, Glasgow, UK.
J Appl Stat. 2021 Nov 16;50(4):909-926. doi: 10.1080/02664763.2021.2001443. eCollection 2023.
This paper presents a new method, the FDCA, that seeks to identify spatially contiguous clusters and incorporate changes in temporal patterns across overcrowded networks. The method is motivated by a graph-based network composed of sensors arranged over space, where the recorded observations for each sensor represent a multi-modal distribution. The proposed method is fully non-parametric and generates clusters within an agglomerative hierarchical clustering approach, based on a measure of distance that defines a cumulative distribution function over temporal changes for different locations in space. Traditional hierarchical clustering algorithms that are spatially adapted do not typically accommodate the temporal characteristics of the underlying data. The effectiveness of the FDCA is illustrated using an application to both empirical and simulated data from about 400 sensors in a 2.5-square-mile network area in downtown San Francisco, California. The results demonstrate the superior ability of the FDCA in identifying clusters compared to functional-only and distributional-only algorithms, and similar performance to a model-based clustering algorithm.
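The general idea of agglomerative clustering on a sensor graph with a distribution-based distance can be illustrated with a minimal sketch. This is not the FDCA itself: it swaps in the area between two empirical cumulative distribution functions (the 1-Wasserstein distance) as the distance, pools observations when clusters merge, and ignores the temporal (functional) component entirely. The function names `ecdf_distance` and `contiguous_clusters`, the toy sensor readings, and the stopping rule (a fixed number of clusters `k`) are all assumptions made for illustration.

```python
from bisect import bisect_right


def ecdf_distance(a, b):
    """Area between the empirical CDFs of samples a and b (1-Wasserstein distance)."""
    a, b = sorted(a), sorted(b)
    grid = sorted(set(a) | set(b))
    area = 0.0
    for left, right in zip(grid, grid[1:]):
        # Both ECDFs are constant on [left, right), so the area is a rectangle.
        fa = bisect_right(a, left) / len(a)
        fb = bisect_right(b, left) / len(b)
        area += abs(fa - fb) * (right - left)
    return area


def contiguous_clusters(samples, edges, k):
    """Greedily merge graph-adjacent clusters with the closest pooled ECDFs.

    samples: dict mapping sensor id -> list of observations (hypothetical input format)
    edges:   iterable of (sensor id, sensor id) pairs describing the sensor graph
    k:       number of clusters at which merging stops
    """
    clusters = {s: {s} for s in samples}              # cluster label -> member sensors
    pooled = {s: list(v) for s, v in samples.items()}  # cluster label -> pooled observations
    adjacency = {frozenset((u, v)) for u, v in edges if u != v}

    def cost(pair):
        u, v = sorted(pair)
        # Labels are included so that ties are broken deterministically.
        return ecdf_distance(pooled[u], pooled[v]), u, v

    while len(clusters) > k and adjacency:
        keep, drop = sorted(min(adjacency, key=cost))
        clusters[keep] |= clusters.pop(drop)
        pooled[keep] += pooled.pop(drop)
        # Redirect edges of the absorbed cluster; discard the resulting self-loop.
        adjacency = {frozenset(keep if c == drop else c for c in p) for p in adjacency}
        adjacency = {p for p in adjacency if len(p) == 2}
    return sorted(sorted(members) for members in clusters.values())


# Toy example: four sensors on a path a - b - c - d (made-up readings).
samples = {"a": [1, 2, 3], "b": [2, 3, 4], "c": [10, 11, 12], "d": [11, 12, 13]}
edges = [("a", "b"), ("b", "c"), ("c", "d")]
print(contiguous_clusters(samples, edges, 2))  # [['a', 'b'], ['c', 'd']]
```

Because only clusters joined by an edge are ever candidates for merging, every resulting cluster is connected in the sensor graph, which is the spatial contiguity property the abstract refers to.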