Department of Psychological Medicine, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK.
South London and Maudsley NHS Foundation Trust, London, UK.
BMJ Open. 2021 Nov 5;11(11):e056601. doi: 10.1136/bmjopen-2021-056601.
Online health forums provide rich and untapped real-time data on population health. Through novel data extraction and natural language processing (NLP) techniques, we characterise the evolution of mental and physical health concerns relating to the COVID-19 pandemic among online health forum users.
We obtained data from three leading online health forums: HealthBoards, Inspire and HealthUnlocked, from the period 1 January 2020 to 31 May 2020. Using NLP, we analysed the content of posts related to COVID-19.
(1) Proportion of forum posts containing COVID-19 keywords; (2) proportion of forum users making their very first post about COVID-19; (3) proportion of COVID-19-related posts containing content related to physical and mental health comorbidities.
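Outcome measures (1) and (2) can be illustrated with a minimal sketch. It rests on several assumptions that do not come from the study: the keyword pattern is a hypothetical stand-in for the study's actual keyword list, posts are represented as plain dictionaries with `user`, `date` and `text` fields, and measure (2) is read as the share of users whose earliest post mentions COVID-19.

```python
import re
from datetime import date

# Hypothetical keyword pattern; the study's real keyword list is not reproduced here.
COVID_PATTERN = re.compile(r"\b(?:covid(?:-?19)?|coronavirus|sars-cov-2)\b", re.IGNORECASE)


def mentions_covid(text):
    """Return True if the text contains any of the assumed COVID-19 keywords."""
    return COVID_PATTERN.search(text) is not None


def keyword_post_proportion(posts):
    """Measure (1): share of all posts that contain a COVID-19 keyword."""
    if not posts:
        return 0.0
    return sum(mentions_covid(p["text"]) for p in posts) / len(posts)


def first_post_covid_proportion(posts):
    """Measure (2), as assumed here: share of users whose earliest post mentions COVID-19."""
    first_posts = {}
    for p in sorted(posts, key=lambda p: p["date"]):
        # setdefault keeps only the earliest post seen for each user
        first_posts.setdefault(p["user"], p)
    if not first_posts:
        return 0.0
    return sum(mentions_covid(p["text"]) for p in first_posts.values()) / len(first_posts)


# Made-up example posts, for illustration only
posts = [
    {"user": "a", "date": date(2020, 1, 5), "text": "Knee pain after running"},
    {"user": "a", "date": date(2020, 3, 12), "text": "Worried about COVID-19 and my asthma"},
    {"user": "b", "date": date(2020, 3, 14), "text": "Is coronavirus dangerous with diabetes?"},
    {"user": "c", "date": date(2020, 2, 1), "text": "Trouble sleeping lately"},
]

print(keyword_post_proportion(posts))      # 2 of 4 posts match
print(first_post_covid_proportion(posts))  # 1 of 3 users' first posts match
```

In this toy data, user `a` posts about COVID-19 only after an unrelated first post, so that user counts toward measure (1) but not measure (2).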
Data from 739 434 posts created by 53 134 unique users were analysed. A total of 35 581 posts (4.8%) contained a COVID-19 keyword. Posts discussing COVID-19 and related comorbid disorders spiked in early to mid-March, around the time lockdowns were implemented globally, which prompted a large number of users to post on online health forums for the first time. Over a quarter of COVID-19-related thread titles mentioned a physical or mental health comorbidity.
We demonstrate that it is feasible to characterise the content of online health forum posts about COVID-19 and to measure changes over time. The pandemic and the corresponding public response have had a significant impact on posters' queries regarding mental health. Social media data sources such as online health forums can be harnessed to strengthen population-level mental health surveillance.