Koss Jonathan, Bohnet-Joschko Sabine
Department of Management and Entrepreneurship, Faculty of Management, Economics and Society, Witten/Herdecke University, Witten, Germany.
JMIR Form Res. 2022 Oct 3;6(10):e39582. doi: 10.2196/39582.
Since the beginning of the COVID-19 pandemic, over 480 million people have been infected and more than 6 million people have died from COVID-19 worldwide. In some patients with acute COVID-19, symptoms manifest over a longer period, a condition also called "long-COVID." Unmet medical needs related to long-COVID are high because no treatments have been approved. Patients experiment with various medications and supplements in the hope of alleviating their suffering, and they often share their experiences on social media.
The aim of this study was to explore the feasibility of using social media mining methods to extract compounds that are important from the patients' perspective. The goal was to provide an overview of the medication strategies and important agents mentioned in Reddit users' self-reports and, by incorporating patients' experiences, to support hypothesis generation for drug repurposing.
We used named-entity recognition to extract substances representing medications or supplements used to treat long-COVID from almost 70,000 posts on the "/r/covidlonghaulers" subreddit. We analyzed the extracted substances using frequency counts, co-occurrence counts, and network analysis to identify important substances and substance clusters.
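The frequency and co-occurrence steps can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `SUBSTANCES` word list is a hypothetical stand-in for their trained named-entity recognition model, the four posts are invented, and the function names are illustrative only. The sketch assumes that two substances co-occur when they are mentioned in the same post.

```python
from collections import Counter
from itertools import combinations

# Hypothetical lexicon; the study used a trained NER model, not a word list.
SUBSTANCES = {"famotidine", "magnesium", "prednisone", "zinc"}

def extract_substances(post):
    """Return the lexicon terms found in one post (toy stand-in for NER)."""
    tokens = {tok.strip(".,;:!?()\"'").lower() for tok in post.split()}
    return tokens & SUBSTANCES

def count_mentions_and_pairs(posts):
    """Count substance mentions and unordered co-occurrence pairs per post."""
    mentions, pairs = Counter(), Counter()
    for post in posts:
        found = sorted(extract_substances(post))  # sorted so pairs are canonical
        mentions.update(found)
        pairs.update(combinations(found, 2))
    return mentions, pairs

# Invented example posts, for illustration only.
posts = [
    "Started famotidine and magnesium last week.",
    "Magnesium plus zinc helped my sleep.",
    "Famotidine, magnesium, and zinc daily.",
    "Prednisone taper did nothing for me.",
]
mentions, pairs = count_mentions_and_pairs(posts)
print(mentions.most_common())
print(pairs.most_common())
```

The pair counts can serve as edge weights in a substance network, with substances as nodes; a clustering or community-detection step would then run on that graph.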
The named-entity recognition algorithm achieved an F1 score of 0.67. A total of 28,447 substance entities and 5789 word co-occurrence pairs were extracted. "Histamine antagonists," "famotidine," "magnesium," "vitamins," and "steroids" were the most frequently mentioned substances. Network analysis revealed three clusters of substances, indicating certain medication patterns.
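For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below shows one common way to compute it at the entity level, assuming exact matching of (post ID, substance) pairs. The abstract does not state the matching criterion the authors used, and the gold and predicted sets here are invented toy data, not study results.

```python
def f1_score(gold, predicted):
    """Entity-level F1 under exact matching of (post_id, substance) pairs."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)

# Toy annotations, for illustration only.
gold = {(0, "famotidine"), (0, "magnesium"), (1, "zinc"), (2, "prednisone")}
predicted = {(0, "famotidine"), (1, "zinc"), (1, "vitamin")}

score = f1_score(gold, predicted)
print(round(score, 3))
```

In this toy case, two of the three predictions are correct (precision 2/3) and two of the four gold entities are found (recall 1/2), giving an F1 of 4/7.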
This feasibility study indicates that network analysis can be used to characterize the medication strategies discussed in social media. Comparison with existing literature shows that this approach identifies substances that are promising candidates for drug repurposing, such as antihistamines, steroids, or antidepressants. In the context of a pandemic, the proposed method could be used to support drug repurposing hypothesis development by prioritizing substances that are important to users.