Warwick Medical School, University of Warwick, Coventry, UK.
Department of Economics, University of Warwick, Coventry, UK.
BMC Med Res Methodol. 2022 May 14;22(1):139. doi: 10.1186/s12874-022-01610-z.
Social media has fundamentally changed the way people look for and share health-related information. There is increasing interest in using spontaneously generated online patient experience (SGOPE) data as a source for health research. We aimed to summarise the state of the art regarding how and why SGOPE data has been used in health research. We determined the sites and platforms used as data sources, the purposes of the studies, the tools and methods used, and any research gaps identified.
We conducted a scoping umbrella review of review papers, published between 2015 and January 2021, that studied the use of SGOPE data for health research. Keyword searches identified 1759 papers, of which 58 relevant studies were included in our review.
Data was drawn from many individual general or health-specific platforms, although Twitter was the most widely used source. The most frequent purposes were surveillance-based: infectious disease tracking, adverse event identification, and mental health triaging. Despite developments in machine learning, the reviews included many small qualitative studies. Most natural language processing (NLP) work used supervised methods for sentiment analysis and classification. The field is at a very early stage: methods need further development and are often not explained. Disciplines also differ in emphasis, with some work focused on accuracy tweaks and other work on application. There is little evidence of any work that either compares the results of both methods on the same data set or brings the ideas together.
Tools, methods, and techniques are still at an early stage of development, but there is strong consensus that this data source will become very important to patient-centred health research.