Alhazzaa Linah, Curcin Vasa
Department of Informatics, King's College London, London, United Kingdom.
Department of Computer Science, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
J Med Internet Res. 2025 Mar 20;27:e53399. doi: 10.2196/53399.
Despite a dramatic increase in the number of people with generalized anxiety disorder (GAD), a substantial number still do not seek help from health professionals, resulting in reduced quality of life. With the growth in popularity of social media platforms, individuals have become more willing to express their emotions through these channels. Therefore, social media data have become valuable for identifying mental health status.
This study investigated the social media posts and behavioral patterns of people with GAD, focusing on language use, emotional expression, topics discussed, and engagement to identify digital markers of GAD, such as anxious patterns and behaviors. These insights could help reveal mental health indicators, aiding in digital intervention development.
Data were first collected from Twitter (subsequently rebranded as X) for the GAD and control groups. Several preprocessing steps were performed. For linguistic analysis, three measurements were defined based on Linguistic Inquiry and Word Count (LIWC). GuidedLDA, a guided variant of latent Dirichlet allocation, was also used to identify the themes present in the tweets. Additionally, users' behaviors were analyzed using Twitter metadata. Finally, we studied the correlation between the GuidedLDA-based themes and users' behaviors.
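As an illustration of how behavior-based features might be derived from tweet metadata, the following minimal sketch computes a per-user hashtag ratio and retweet ratio. The abstract does not define these ratios, so the definitions here (share of a user's tweets containing at least one hashtag, and share of tweets that are retweets) are assumptions, as are the `Tweet` type and the `behavior_ratios` function name.

```python
import re
from dataclasses import dataclass

# Simple hashtag pattern; an assumption, not the study's tokenizer.
HASHTAG = re.compile(r"#\w+")


@dataclass
class Tweet:
    """Hypothetical minimal tweet record for illustration."""
    text: str
    is_retweet: bool


def behavior_ratios(tweets):
    """Return assumed hashtag and retweet ratios for one user's tweets."""
    n = len(tweets)
    if n == 0:
        return {"hashtag_ratio": 0.0, "retweet_ratio": 0.0}
    # Count tweets with at least one hashtag, and tweets flagged as retweets.
    with_hashtag = sum(1 for t in tweets if HASHTAG.search(t.text))
    retweets = sum(1 for t in tweets if t.is_retweet)
    return {
        "hashtag_ratio": with_hashtag / n,
        "retweet_ratio": retweets / n,
    }


if __name__ == "__main__":
    sample = [
        Tweet("feeling #anxious today", False),
        Tweet("RT @someone: hello", True),
        Tweet("no tags here", False),
        Tweet("#sleep #worry", False),
    ]
    print(behavior_ratios(sample))
```

Per-user ratios of this kind could then be compared between the GAD and control groups; the study's actual feature definitions may differ.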
The linguistic analysis indicated differences in cognitive style, personal needs, and emotional expressiveness between people with and without GAD. Regarding cognitive style, there were significant differences (P<.001) for all features, such as insight (Cohen d=1.13), causation (Cohen d=1.03), and discrepancy (Cohen d=1.16). Regarding personal needs, there were significant differences (P<.001) in most categories, such as curiosity (Cohen d=1.05) and communication (Cohen d=0.64). Regarding emotional expressiveness, there were significant differences (P<.001) for most features, including anxiety (Cohen d=0.62), anger (Cohen d=0.72), sadness (Cohen d=0.48), and swear words (Cohen d=2.61). Additionally, topic modeling identified 4 primary themes (ie, symptoms, relationships, life problems, and feelings). All themes were significantly more prevalent among people with GAD than among those without GAD (P<.001), with medium or larger effect sizes (Cohen d>0.50) for most themes. Moreover, studying users' behaviors, including hashtag participation, volume, interaction pattern, social engagement, and reactive behaviors, revealed some digital markers of GAD, with most behavior-based features, such as the hashtag (Cohen d=0.49) and retweet (Cohen d=0.69) ratios, differing significantly between groups (P<.001). Furthermore, correlations between the GuidedLDA-based themes and users' behaviors were also identified.
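For readers unfamiliar with the effect size reported above, the following minimal sketch computes Cohen d for two independent groups using the pooled standard deviation. The abstract does not state which variant of Cohen d was used, so the pooled-variance form is an assumption, and the function name `cohen_d` and the sample values are purely illustrative.

```python
import math
from statistics import mean, variance


def cohen_d(group_a, group_b):
    """Cohen d for two independent samples, using the pooled SD.

    Assumes the classic pooled-variance formula; the study may have
    used a different variant.
    """
    n1, n2 = len(group_a), len(group_b)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least 2 observations")
    # Pooled sample variance, weighted by degrees of freedom.
    pooled_var = (
        (n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)
    ) / (n1 + n2 - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; d is undefined")
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


if __name__ == "__main__":
    # Illustrative feature scores only, not data from the study.
    gad_scores = [2, 4, 6]
    control_scores = [1, 3, 5]
    print(cohen_d(gad_scores, control_scores))
```

By the usual convention, values near 0.2, 0.5, and 0.8 are read as small, medium, and large effects, respectively.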
Our findings revealed several digital markers of GAD on social media. These findings are significant and could contribute to developing an assessment tool that clinicians could use for the initial diagnosis of GAD or the detection of an early signal of worsening in people with GAD via social media posts. This tool could provide ongoing support and personalized coping strategies. However, one limitation of using social media for mental health assessment is the lack of a demographic representativeness analysis.