Clustering of COVID-19 data for knowledge discovery using c-means and fuzzy c-means.

Authors

Afzal Asif, Ansari Zahid, Alshahrani Saad, Raj Arun K, Saheer Kuruniyan Mohamed, Ahamed Saleel C, Nisar Kottakkaran Sooppy

Affiliations

Department of Mechanical Engineering, P. A. College of Engineering (Affiliated to Visvesvaraya Technological University, Belagavi), Mangaluru, India.

Electrical Engineering Section, University Polytechnic, Aligarh Muslim University, Aligarh, India.

Publication information

Results Phys. 2021 Oct;29:104639. doi: 10.1016/j.rinp.2021.104639. Epub 2021 Aug 21.

DOI:10.1016/j.rinp.2021.104639

PMID:34513577

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8424416/

Abstract

In this work, partitioning clustering of COVID-19 data is carried out using the c-Means (c-M) and Fuzzy c-Means (Fc-M) algorithms. Using the data available from January 2020, the daily confirmed cases, recoveries, and deaths are clustered with respect to location, i.e., longitude and latitude on the globe. In the analysis, the maximum number of clusters is treated as a variable and is varied from 5 to 50 in both algorithms to find an optimum number. The performance and validity indices of the resulting clusters are analysed to assess cluster quality. The validity indices considered are the Zahid SC (Separation Compaction) index, the Xie-Beni index, the Fukuyama-Sugeno index, the Validity function, the PC (performance coefficient) index, and the CE (entropy) index. The results indicate that five clusters were identified as the major centroids where the pandemic appears concentrated. The observations also show that the pandemic spreads easily to any location worldwide and that several COVID-19 centroids act primarily as epicentres. The three main COVID-19 clusters identified are 1) cases below 50,000, 2) cases between 0.1 million and 2 million, and 3) cases above 2 million. Their centroids are located in the US, Brazil, and India, and the remaining smaller clusters of the pandemic appear oriented around them. Furthermore, the Fc-M technique appears to provide much better clusters than the c-M algorithm.
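As a rough illustration of the workflow the abstract describes, the following is a minimal NumPy sketch of the textbook fuzzy c-means update together with three validity measures. It is not the authors' implementation. The function names `fuzzy_c_means` and `validity_indices`, the fuzzifier `m = 2`, the stopping tolerance, and the synthetic points standing in for (longitude, latitude) pairs are all assumptions made for this example. PC and CE are computed here with the definitions commonly called the partition coefficient and classification entropy, and the Xie-Beni index uses memberships raised to the power `m`; the paper's exact formulations may differ.

```python
import numpy as np


def _centers(U, X, m):
    """Membership-weighted cluster centers, shape (c, d)."""
    Um = U ** m
    return (Um @ X) / Um.sum(axis=1, keepdims=True)


def _sq_dists(X, centers):
    """Squared Euclidean distances from each center to each point, shape (c, n)."""
    return ((X[None, :, :] - centers[:, None, :]) ** 2).sum(axis=2)


def fuzzy_c_means(X, c, m=2.0, max_iter=300, tol=1e-6, seed=0):
    """Return (centers, U); U[i, k] is the membership of point k in cluster i."""
    rng = np.random.default_rng(seed)
    # Random initial memberships; each column (one point) sums to 1.
    U = rng.random((c, X.shape[0]))
    U /= U.sum(axis=0, keepdims=True)
    for _ in range(max_iter):
        centers = _centers(U, X, m)
        # Floor the distances so a point sitting on a center does not divide by zero.
        d2 = np.fmax(_sq_dists(X, centers), 1e-12)
        # Standard update: u_ik is proportional to d_ik^(-2 / (m - 1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    # Recompute centers so they match the returned memberships.
    return _centers(U, X, m), U


def validity_indices(X, centers, U, m=2.0):
    """Partition coefficient, classification entropy, and Xie-Beni index."""
    n = X.shape[0]
    pc = (U ** 2).sum() / n                          # higher suggests crisper partitions
    ce = -(U * np.log(np.fmax(U, 1e-12))).sum() / n  # lower suggests crisper partitions
    compactness = ((U ** m) * _sq_dists(X, centers)).sum()
    # Smallest squared distance between two distinct centers.
    sep = ((centers[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    sep[np.diag_indices_from(sep)] = np.inf
    xb = compactness / (n * sep.min())               # lower suggests compact, separated clusters
    return {"PC": pc, "CE": ce, "XB": xb}


# Synthetic 2-D points standing in for (longitude, latitude); NOT real COVID-19 data.
demo_rng = np.random.default_rng(42)
X_demo = np.vstack([
    demo_rng.normal(loc, 1.0, size=(50, 2))
    for loc in ([-100.0, 40.0], [-50.0, -10.0], [80.0, 20.0])
])

# Scan a small range of cluster counts, mirroring the idea of varying the number of clusters.
for c in range(2, 7):
    centers, U = fuzzy_c_means(X_demo, c)
    print(c, validity_indices(X_demo, centers, U))
```

Taking `U.argmax(axis=0)` turns the soft memberships into hard labels, which is one simple way to compare a fuzzy partition against a crisp one. The indices are reported side by side rather than combined, since each one can favour a different number of clusters.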
