Singh Koushlendra Kumar, Kumar Suraj, Dixit Prachi, Bajpai Manish Kumar
National Institute of Technology, Jamshedpur, India.
Jai Narayan Vyas University, Jodhpur, India.
Appl Intell (Dordr). 2021;51(5):2714-2726. doi: 10.1007/s10489-020-01948-1. Epub 2020 Nov 3.
Coronavirus disease 2019 (COVID-19) has emerged as a global medical emergency, and the spread of this pandemic has shown many variations. This article analyzes the latest data on COVID-19 spread together with demographic and environmental factors. Data gathered from various sources are integrated and passed to different machine learning models to check their suitability. Random Forest, an ensemble learning technique, gives a good evaluation score on the test data. Using this technique, important factors are identified and their contribution to the spread is analyzed. Linear relationships between features are also plotted as a heat map of the Pearson correlation matrix. Finally, a Kalman filter is used to estimate the future spread of SARS-CoV-2, and it shows good results on the test data. The inferences from Random Forest feature importance and Pearson correlation show many similarities and a few dissimilarities, and both techniques successfully identify the contributing factors. The Kalman filter gives satisfactory results for short-term estimation but performs less well for long-term forecasting. Overall, the analysis, plots, inferences, and forecasts are satisfactory and can help in fighting the spread of the virus.
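The abstract does not state which state model the Kalman filter uses, so the following is only a minimal sketch under an assumed local linear trend model (state = level and daily trend of a case count). The function name `kalman_trend_forecast` and the noise parameters `q` and `r` are hypothetical, chosen for illustration, and are not taken from the article.

```python
def kalman_trend_forecast(series, horizon=3, q=1e-2, r=1.0):
    """Filter `series` with an assumed local linear trend model, then extrapolate.

    q: process noise variance added to each state (illustrative value).
    r: measurement noise variance (illustrative value).
    """
    # State is [level, trend]; start at the first observation with zero trend.
    level, trend = float(series[0]), 0.0
    # Symmetric covariance P = [[p00, p01], [p01, p11]]; large values mean
    # the initial state is treated as highly uncertain.
    p00, p01, p11 = 1e3, 0.0, 1e3

    for z in series[1:]:
        # Predict step: x <- F x and P <- F P F^T + Q, with F = [[1, 1], [0, 1]].
        level = level + trend
        p00, p01, p11 = p00 + 2 * p01 + p11 + q, p01 + p11, p11 + q

        # Update step with observation z, where only the level is observed.
        s = p00 + r                    # innovation variance
        k0, k1 = p00 / s, p01 / s      # Kalman gain for level and trend
        resid = z - level              # innovation
        level += k0 * resid
        trend += k1 * resid
        p00, p01, p11 = (1 - k0) * p00, (1 - k0) * p01, p11 - k1 * p01

    # Forecast by repeating the predict step with no new observations.
    return [level + h * trend for h in range(1, horizon + 1)]


if __name__ == "__main__":
    # Made-up cumulative counts, used only to show the call.
    print(kalman_trend_forecast([10, 20, 30, 40, 50], horizon=3))
```

Because the forecast simply extends the last estimated trend, its error can be expected to grow with the horizon when the real trend changes, which fits the abstract's remark that short-term estimates are better than long-term ones.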