Gupta Himanshu, Verma Om Prakash
Department of Instrumentation and Control Engineering, Dr. B R Ambedkar National Institute of Technology, Jalandhar, India.
Evol Intell. 2023;16(3):739-757. doi: 10.1007/s12065-022-00704-3. Epub 2022 Mar 9.
The COVID-19 pandemic has severely affected people of all ages worldwide. Vaccines were therefore developed and made available for public use in an unprecedented time frame. However, because of varying levels of hesitancy, they have not gained general acceptance. The main objective of this work is to identify the risks associated with COVID-19 vaccines by developing a prognosis tool that will help to improve their acceptability and thereby reduce the lethality of SARS-CoV-2. The raw VAERS dataset comprises three files, covering medical history, vaccination status, and post-vaccination symptoms respectively, with more than 354 thousand samples. After pre-processing, the raw files were merged into a single dataset with 85 attributes; the analysis, however, was divided into three scenarios: (i) medical history, (ii) reaction to vaccination, and (iii) a combination of both. Machine Learning (ML) models, including Linear Regression (LR), Random Forest (RF), Naive Bayes (NB), Light Gradient Boosting Algorithm (LGBM), and Multilayer feed-forward perceptron (MLP), were employed to predict the most probable outcome, and their performance was evaluated using various performance parameters. In addition, the chi-square test (statistical), LR, RF, and LGBM were used to estimate which attributes in the dataset most probably resulted in death, hospitalization, and COVID-19. Across these scenarios, the models identify different attributes (such as cardiac arrest, cancer, hyperlipidemia, kidney disease, diabetes, atrial fibrillation, dementia, and thyroid) for death, hospitalization, and COVID-19 even after vaccination. For prediction, LGBM outperforms all the other developed models in most scenarios, whereas LR, RF, NB, and MLP perform satisfactorily only in some cases. Males in the 50-70 age group were found to be most susceptible to the virus.
People with existing serious illnesses were also found to be most vulnerable; they must therefore be vaccinated under close observation. In general, no serious adverse effect of the vaccine was observed, so people must get vaccinated as early as possible and without hesitation. The model developed using LGBM also outperforms all the other prediction models. It can therefore be very helpful to policymakers in administering the different vaccination programs and prioritizing the population for them.
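The chi-square attribute ranking mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes every attribute and the outcome are binary (0/1), uses the plain Pearson statistic for a 2x2 table without continuity correction, and runs on a small synthetic toy table that is not drawn from VAERS. The names `chi_square_2x2`, `rank_attributes`, and the column names are hypothetical.

```python
from collections import Counter


def chi_square_2x2(attribute, outcome):
    """Pearson chi-square statistic for two binary (0/1) sequences.

    Assumption: no continuity correction; returns 0.0 when a row or
    column of the 2x2 table is empty (the statistic is undefined there).
    """
    counts = Counter(zip(attribute, outcome))
    a = counts[(1, 1)]  # attribute present, outcome present
    b = counts[(1, 0)]  # attribute present, outcome absent
    c = counts[(0, 1)]  # attribute absent, outcome present
    d = counts[(0, 0)]  # attribute absent, outcome absent
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def rank_attributes(records, outcome_key):
    """Rank every non-outcome column by its chi-square score, highest first."""
    outcome = [r[outcome_key] for r in records]
    names = [k for k in records[0] if k != outcome_key]
    scores = {
        name: chi_square_2x2([r[name] for r in records], outcome)
        for name in names
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Synthetic toy records (NOT VAERS data); column names are hypothetical.
records = [
    {"diabetes": 1, "thyroid": 1, "died": 1},
    {"diabetes": 1, "thyroid": 0, "died": 1},
    {"diabetes": 1, "thyroid": 1, "died": 1},
    {"diabetes": 1, "thyroid": 0, "died": 0},
    {"diabetes": 0, "thyroid": 1, "died": 0},
    {"diabetes": 0, "thyroid": 0, "died": 0},
    {"diabetes": 0, "thyroid": 1, "died": 0},
    {"diabetes": 0, "thyroid": 0, "died": 0},
]

ranking = rank_attributes(records, "died")
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A higher score only indicates a stronger statistical association between an attribute and the outcome in the table; it says nothing about causation, and with a real dataset of 85 attributes the scores would normally be turned into p-values before any attribute is singled out.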