Department of Biomedical Informatics, School of Medicine, Emory University, Atlanta, Georgia, USA.
J Am Med Inform Assoc. 2020 Aug 1;27(8):1310-1315. doi: 10.1093/jamia/ocaa116.
To mine Twitter and quantitatively analyze COVID-19 symptoms self-reported by users, compare symptom distributions across studies, and create a symptom lexicon for future research.
We retrieved tweets using COVID-19-related keywords and applied semiautomatic filtering to curate self-reports from users who had tested positive. We extracted the COVID-19-related symptoms these users mentioned, mapped them to standard concept IDs in the Unified Medical Language System, and compared the resulting distributions with those reported in early studies from clinical settings.
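The mapping and distribution steps described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the lexicon entries, the placeholder concept IDs (`CUI_FEVER` and so on, which are not real UMLS identifiers), the user names, and the function names are all hypothetical. It assumes symptom expressions have already been extracted per user, and it computes each concept's share among users who reported at least 1 symptom.

```python
from collections import Counter

# Hypothetical lexicon: lower-cased symptom expression -> placeholder concept ID.
# Real work would use concept IDs looked up in the UMLS, not these stand-ins.
LEXICON = {
    "fever": "CUI_FEVER",
    "high temperature": "CUI_FEVER",
    "cough": "CUI_COUGH",
    "dry cough": "CUI_COUGH",
    "can't smell": "CUI_ANOSMIA",
    "lost my sense of smell": "CUI_ANOSMIA",
}


def map_expressions(expressions):
    """Return the set of concept IDs for expressions found in the lexicon."""
    normalized = (e.lower().strip() for e in expressions)
    return {LEXICON[e] for e in normalized if e in LEXICON}


def symptom_distribution(user_expressions):
    """Percentage of symptomatic users who reported each concept.

    A user counts once per concept, however many expressions map to it.
    Users with no mapped symptom are excluded from the denominator.
    """
    per_user = {u: map_expressions(x) for u, x in user_expressions.items()}
    symptomatic = [concepts for concepts in per_user.values() if concepts]
    if not symptomatic:
        return {}
    counts = Counter(c for concepts in symptomatic for c in concepts)
    n = len(symptomatic)
    return {c: round(100 * k / n, 1) for c, k in counts.items()}


# Hypothetical toy input: expressions already extracted from each user's tweets.
users = {
    "user_a": ["Fever", "dry cough"],
    "user_b": ["high temperature", "fever"],
    "user_c": [],
    "user_d": ["can't smell", "cough"],
}
dist = symptom_distribution(users)
```

Collapsing several expressions to one concept per user before counting keeps a user who writes both "fever" and "high temperature" from being counted twice for the same symptom.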
We identified 203 users who tested positive; together they reported 1002 symptoms using 668 unique expressions. Among users who reported at least 1 symptom, the most frequently reported symptoms were fever/pyrexia (66.1%), cough (57.9%), body ache/pain (42.7%), fatigue (42.1%), headache (37.4%), and dyspnea (36.3%). Mild symptoms, such as anosmia (28.7%) and ageusia (28.1%), were frequently reported on Twitter but not in clinical studies.
The spectrum of COVID-19 symptoms identified from Twitter may complement those identified in clinical settings.