

Uncovering the impact of outliers on clusters' evolution in temporal data-sets: an empirical analysis.

Author information

Atif Muhammad, Farooq Muhammad, Shafiq Muhammad, Alballa Tmader, Abdualziz Alhabeeb Somayah, Abd El-Wahed Khalifa Hamiden

Affiliations

Department of Statistics, University of Peshawar, Peshawar, Pakistan.

Institute of Numerical Sciences, Kohat University of Science and Technology, Kohat, Pakistan.

Publication information

Sci Rep. 2024 Dec 28;14(1):30674. doi: 10.1038/s41598-024-75928-7.

Abstract

This study investigates the impact of outliers on the evolution of clusters in temporal data-sets. Monitoring and tracing cluster transitions of temporal data-sets allow us to observe how clusters evolve and change over time. By tracking the movement of data points between clusters, we can gain insights into the underlying patterns, trends, and dynamics of the data. This understanding is essential for making informed decisions and drawing meaningful conclusions from the clustering results. Cluster evolution refers to the changes that occur in the clustering results over time due to the arrival of new data points. The changes in cluster solutions are classified as external and internal transitions. The study employs the survival ratio and history cost function to investigate the effects of outliers on changes experienced by the clusters at successive time points. The results demonstrate that outliers have a significant impact on cluster evolution, and appropriate outlier-handling techniques are necessary to obtain reliable clustering results. The findings of this study provide useful insights for practitioners and researchers in the field of stream clustering and can help guide the development of more robust and accurate stream clustering algorithms.
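The abstract names the survival ratio but does not define it. As a rough illustration only, the sketch below assumes one common formulation from the cluster-transition literature: a cluster from the previous time point "survives" if more than a threshold share of its points falls into a single cluster at the current time point, and the survival ratio is the fraction of previous clusters that survive. The function names, the threshold `tau`, and the toy data are all hypothetical and are not taken from the paper; the sketch also ignores refinements such as distinguishing survival from absorption.

```python
from typing import Dict, Hashable, Set

# A clustering maps a cluster label to the set of point identifiers it contains.
Clustering = Dict[Hashable, Set[Hashable]]


def overlap(old: Set[Hashable], new: Set[Hashable]) -> float:
    """Share of the old cluster's points that also appear in the new cluster."""
    return len(old & new) / len(old) if old else 0.0


def survival_ratio(prev: Clustering, curr: Clustering, tau: float = 0.5) -> float:
    """Fraction of clusters in `prev` matched by some cluster in `curr`.

    Assumption: a match means overlap strictly greater than `tau`.
    """
    if not prev:
        return 0.0
    survived = sum(
        1
        for old in prev.values()
        if any(overlap(old, new) > tau for new in curr.values())
    )
    return survived / len(prev)


# Toy data (invented for illustration): two clusters at the previous time point.
prev = {"A": {1, 2, 3, 4}, "B": {5, 6, 7, 8}}

# New points 9 and 10 join the existing clusters without disturbing them.
curr_clean = {"A": {1, 2, 3, 4, 9}, "B": {5, 6, 7, 8, 10}}

# Here point 10 is treated as an outlier that ends up in its own cluster
# and cluster B is split in two, so B no longer has a majority match.
curr_outliers = {"A": {1, 2, 3, 4, 9}, "B1": {5, 6}, "B2": {7, 8}, "O": {10}}

ratio_clean = survival_ratio(prev, curr_clean)
ratio_outliers = survival_ratio(prev, curr_outliers)
print(ratio_clean, ratio_outliers)
```

In this toy setup both previous clusters survive in the clean case, while only one of the two survives once cluster B is split, which is the kind of shift in cluster evolution the abstract attributes to outliers.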


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/58b9/11681016/560ece406d32/41598_2024_75928_Fig1_HTML.jpg
