Department of Applied Cybernetics, Faculty of Science, University of Hradec Králové, 50003 Hradec Králové, Czech Republic.
Department of Mathematics, Faculty of Science, University of Hradec Králové, 50003 Hradec Králové, Czech Republic.
Sensors (Basel). 2021 Nov 15;21(22):7582. doi: 10.3390/s21227582.
People worldwide use social media extensively to share thoughts, societal issues, and personal concerns. Social media can be viewed as an intelligent platform that can be augmented with the capability to analyze and predict issues such as business needs, environmental needs, election trends (polls), and governmental needs. This motivated us to undertake a comprehensive search of views and opinions on the COVID-19 pandemic among Twitter users. The basic training data were collected from Twitter posts. On this basis, we developed an approach based on ensemble deep learning techniques that aims to predict the future evolution of views on Twitter better than previous work on the same task. First, feature extraction is performed by an N-gram stacked autoencoder supervised learning algorithm. The extracted features are then used for classification and prediction by an ensemble fusion scheme of selected machine learning techniques: decision tree (DT), support vector machine (SVM), random forest (RF), and K-nearest neighbour (KNN). All individual results are combined (fused) for a better prediction by using both mean and mode techniques. Our proposed scheme, an N-gram stacked encoder integrated into an ensemble machine learning scheme, outperforms the existing competing techniques, such as the unigram autoencoder and the bigram autoencoder. Our experimental results were obtained from a comprehensive evaluation on a dataset extracted from open-source data available from Twitter, filtered by the keywords "covid", "covid19", "coronavirus", "covid-19", "sarscov2", and "covid_19".
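The mean and mode fusion step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `fuse_mode` and `fuse_mean`, the three class labels, and the per-classifier outputs are all hypothetical values chosen for illustration. The sketch assumes each of the four base classifiers (DT, SVM, RF, KNN) returns both a predicted label and a per-class probability vector for a tweet.

```python
from collections import Counter
from statistics import mean

# Hypothetical class labels; the abstract does not name the classes.
CLASSES = ["negative", "neutral", "positive"]


def fuse_mode(label_votes):
    """Mode fusion: return the label predicted by the most classifiers.

    Ties are resolved in favour of the label that appears first in the list.
    """
    return Counter(label_votes).most_common(1)[0][0]


def fuse_mean(prob_rows):
    """Mean fusion: average per-class probabilities across classifiers.

    Returns the index of the class with the highest averaged probability,
    together with the averaged probability vector.
    """
    n_classes = len(prob_rows[0])
    avg = [mean(row[c] for row in prob_rows) for c in range(n_classes)]
    best = max(range(n_classes), key=avg.__getitem__)
    return best, avg


# Made-up outputs of DT, SVM, RF, and KNN for a single tweet.
votes = ["positive", "negative", "positive", "positive"]
probs = [
    [0.2, 0.1, 0.7],  # DT
    [0.5, 0.2, 0.3],  # SVM
    [0.1, 0.3, 0.6],  # RF
    [0.2, 0.2, 0.6],  # KNN
]

mode_label = fuse_mode(votes)
mean_index, mean_probs = fuse_mean(probs)
mean_label = CLASSES[mean_index]
```

In this made-up case both fusion rules agree on the same class. They can disagree when one classifier is very confident and the others are only marginally in favour of a different class, which is one plausible reason to report both.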