College of Health and Biomedicine, Royal Melbourne Institute of Technology (RMIT), Australia.
Institute for Health and Sport, Victoria University, Melbourne, Australia.
Psychiatry Res. 2023 Dec;330:115579. doi: 10.1016/j.psychres.2023.115579. Epub 2023 Nov 3.
Text analyses of social media posts are a promising source of mental health information. This study used natural language processing to explore distinct language patterns on Twitter related to self-reported anxiety diagnosis.
A total of 233,000 tweets made by 605 users (300 reporting an anxiety diagnosis and 305 not) over six months were comparatively analysed, considering user behavior, Linguistic Inquiry and Word Count (LIWC), and sentiment analysis. Twitter users with a self-disclosed diagnosis of anxiety were classified as 'anxious' to facilitate group comparisons.
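As a rough illustration of the kind of per-user features such an analysis might start from, the following minimal sketch computes posting frequency, mean tweet length, and the share of positive and negative words. The function name `user_features` and the two tiny word lists are hypothetical stand-ins chosen for this example; they are not the LIWC dictionary or the sentiment tool used in the study.

```python
from statistics import mean

# Hypothetical mini-lexicons for illustration only; the LIWC dictionary
# used in the study is proprietary and is not reproduced here.
POSITIVE = {"happy", "great", "love", "good"}
NEGATIVE = {"worried", "afraid", "sad", "bad"}


def user_features(tweets, days):
    """Summarise one user's tweets as simple behavioural and lexical features."""
    # Whitespace tokenisation of each tweet, lower-cased.
    tokens_per_tweet = [t.lower().split() for t in tweets]
    # Flatten and strip common trailing punctuation before lexicon lookup.
    all_tokens = [w.strip(".,!?") for toks in tokens_per_tweet for w in toks]
    n_tokens = len(all_tokens) or 1  # avoid division by zero for empty input
    return {
        "tweets_per_day": len(tweets) / days,
        "mean_words_per_tweet": (
            mean(len(toks) for toks in tokens_per_tweet) if tweets else 0.0
        ),
        "positive_rate": sum(w in POSITIVE for w in all_tokens) / n_tokens,
        "negative_rate": sum(w in NEGATIVE for w in all_tokens) / n_tokens,
    }


features = user_features(["I am worried and sad today.", "Good morning!"], days=2)
print(features)
```

Feature vectors of this shape, one per user, could then be passed to a supervised classifier or a profile analysis; the actual feature set and preprocessing in the study may differ.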
Supervised machine learning models showed a high prediction accuracy (Naïve Bayes 81.1 %, Random Forests 79.8 %, and LASSO-regression 79.4 %) in identifying Twitter users' self-disclosed diagnosis of anxiety. Additionally, a Latent Profile Analysis (LPA) identified four profiles characterized by high sentiment (31 % anxious participants), low sentiment (68 % anxious), self-immersed (80 % anxious), and normative behavior (38 % anxious).
The digital footprint of self-disclosed anxiety in Twitter posts was characterized by a high frequency of words conveying negative sentiment, a low frequency of words conveying positive sentiment, a reduced frequency of posting, and lengthier texts. These distinct patterns enabled highly accurate prediction of self-disclosed anxiety diagnosis. On this basis, appropriately resourced, awareness-raising online mental health campaigns are advocated.