Kumar Sanjay
Department of Statistics, Central University of Rajasthan, Bandarsindri, Kishangarh, Ajmer, Rajasthan 305817 India.
Ann Data Sci. 2020;7(3):417-425. doi: 10.1007/s40745-020-00289-7. Epub 2020 May 19.
Identifying and forming groups within infectious disease data sets is a major challenge. Data mining, the process of uncovering hidden characteristics of big data, is one of the techniques that has become increasingly popular for handling massive volumes of infectious disease data. In the current study, we apply cluster analysis, a data mining technique, to the novel coronavirus disease (COVID-19) data set for the states and union territories (UTs) of India, grouping them according to their similarity to one another. The results give a picture of the clusters of affected Indian states and UTs. The main objective of clustering in this study is to optimize monitoring techniques in the affected states and UTs, which should be valuable to the government, doctors, the police, and others involved in understanding the seriousness of the spread of COVID-19, and to help improve government policies, decisions, medical facilities (ventilators, testing kits, masks, etc.), and treatment so as to reduce the number of infected and deceased persons.