Faculty of Computer Engineering, Shahrood University of Technology, Shahrood 3619995161, Iran.
J Biomed Inform. 2021 Sep;121:103862. doi: 10.1016/j.jbi.2021.103862. Epub 2021 Jul 3.
COVID-19 emerged only recently, yet it has already shaken the international community. The unknown nature of the virus, evidence of its adaptability and survival in new conditions, its wide prevalence and lengthy recovery period, together with daily reports of new infections and fatalities, have created a wave of fear and anxiety among the public and the authorities. These factors have changed the social discourse drastically within a short period of time. Analyzing this discourse is important for reconciling society and restoring ordinary conditions of mental peace and health. Although much research has been done on the disease since it became a pandemic, sociological analysis of this recent public phenomenon, especially in developing countries, still needs attention. We propose a framework for analyzing social media data and news stories related to COVID-19. Our research is based on an extensive Persian data set gathered from different social media networks and news agencies between January 21 and April 29, 2020. We use the Latent Dirichlet Allocation (LDA) model and dynamic topic modeling to capture how the discourse changes over time in terms of its topics. We examine the reasons for topic shifts by exploring the related events and the practices and policies adopted. Because social discourse can strongly affect community morale and polarization, we further analyze polarization in online social media posts and detect points of concept drift in the stream. Based on the analyzed content, we extract guidelines for shifting polarization in a positive direction. The results show that the proposed framework provides an effective, practical approach to cause-and-effect analysis of the social discourse.
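The abstract mentions detecting points of concept drift in the post stream but does not name the detector used. As an illustration only, the sketch below applies a two-sided Page-Hinkley test, a standard change-detection technique that is swapped in here as an assumption, to a hypothetical sequence of per-post polarity scores in [-1, 1]. The class name `PageHinkley`, the function `detect_drifts`, and the values of `delta` and `threshold` are all illustrative choices, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class PageHinkley:
    """Two-sided Page-Hinkley test for a shift in the mean of a stream."""
    delta: float = 0.05      # deviation tolerated before evidence accumulates
    threshold: float = 2.0   # accumulated evidence needed to raise an alarm
    n: int = 0
    mean: float = 0.0
    cum_up: float = 0.0      # cumulative evidence for an upward shift
    min_up: float = 0.0
    cum_down: float = 0.0    # cumulative evidence for a downward shift
    max_down: float = 0.0

    def update(self, x: float) -> bool:
        """Consume one observation; return True if a drift is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n  # running mean of the stream
        self.cum_up += x - self.mean - self.delta
        self.min_up = min(self.min_up, self.cum_up)
        self.cum_down += x - self.mean + self.delta
        self.max_down = max(self.max_down, self.cum_down)
        rose = self.cum_up - self.min_up > self.threshold
        fell = self.max_down - self.cum_down > self.threshold
        return rose or fell


def detect_drifts(stream, delta=0.05, threshold=2.0):
    """Return the indices at which a drift alarm fires.

    The detector is reset after each alarm so that later shifts are
    measured against the new regime rather than the old one.
    """
    detector = PageHinkley(delta, threshold)
    drifts = []
    for i, x in enumerate(stream):
        if detector.update(x):
            drifts.append(i)
            detector = PageHinkley(delta, threshold)
    return drifts


if __name__ == "__main__":
    # Hypothetical polarity scores: positive for 30 posts, then negative.
    scores = [0.5] * 30 + [-0.5] * 30
    print(detect_drifts(scores))
```

On this synthetic stream the alarm fires a few observations after the mean changes at index 30. In practice the detection delay and the false-alarm rate both depend on `delta` and `threshold`, which would need tuning against real polarity data.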