Smith Delaney A, Lavertu Adam, Salecha Aadesh, Hamamsy Tymor, Humphreys Keith, Lembke Anna, Kiang Mathew V, Altman Russ B, Eichstaedt Johannes C
Biochemistry Department, Stanford University School of Medicine, Stanford, CA, 94305, USA.
Department of Biomedical Data Science, Stanford University, Stanford, CA, 94305, USA.
NPJ Digit Med. 2025 May 15;8(1):284. doi: 10.1038/s41746-025-01642-x.
The opioid epidemic persists in the U.S., with over 80,000 deaths annually since 2021, primarily driven by synthetic opioids. Responding to this evolving epidemic requires reliable and timely information. One source of data is social media platforms. We assessed the utility of Reddit data for surveillance, covering heroin, prescription, and synthetic drugs. We built a natural language processing pipeline to identify opioid-related content and created a cohort of 1,689,039 Reddit users, each assigned to a state based on their previous Reddit activity. We measured their opioid-related posts over time and compared rates against CDC overdose and NFLIS report rates. To simulate the real-world prediction of synthetic opioid overdose rates, we added near real-time Reddit data to a model relying on CDC mortality data with a typical 6-month reporting lag. Reddit data significantly improved the prediction accuracy of overdose rates. This work suggests that social media can help monitor drug epidemics.
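The forecasting comparison described above, a model limited to CDC mortality data with a 6-month reporting lag versus the same model with a near real-time Reddit signal added, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, ordinary least squares stands in for whatever model the authors used, and the names `compare_models`, `fit_ols`, and `predict` are hypothetical. It shows the shape of the experiment, not the paper's method or results.

```python
import numpy as np

LAG = 6  # reporting delay in months, taken from the abstract


def fit_ols(X, y):
    """Fit ordinary least squares with an intercept; return the coefficients."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply coefficients from fit_ols to a feature matrix."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return X1 @ coef


def compare_models(seed=0, n_months=120, n_test=36):
    """Return (baseline MAE, Reddit-augmented MAE) on held-out months.

    Synthetic setup (assumption): the overdose rate follows a random walk,
    and the Reddit posting rate is a noisy proxy for the current rate.
    """
    rng = np.random.default_rng(seed)
    overdose = 20 + np.cumsum(rng.normal(0, 1.0, n_months))
    reddit = 0.5 * overdose + rng.normal(0, 0.3, n_months)

    y = overdose[LAG:]            # rate to predict for month t
    cdc_lagged = overdose[:-LAG]  # latest CDC value available: month t - LAG
    reddit_now = reddit[LAG:]     # Reddit signal observed in month t

    X_base = cdc_lagged.reshape(-1, 1)
    X_aug = np.column_stack([cdc_lagged, reddit_now])

    # Chronological split: fit on early months, evaluate on the last n_test.
    split = len(y) - n_test
    coef_base = fit_ols(X_base[:split], y[:split])
    coef_aug = fit_ols(X_aug[:split], y[:split])

    mae_base = np.mean(np.abs(predict(coef_base, X_base[split:]) - y[split:]))
    mae_aug = np.mean(np.abs(predict(coef_aug, X_aug[split:]) - y[split:]))
    return mae_base, mae_aug


if __name__ == "__main__":
    mae_base, mae_aug = compare_models()
    print(f"MAE, lagged CDC only:      {mae_base:.2f}")
    print(f"MAE, lagged CDC + Reddit:  {mae_aug:.2f}")
```

In this toy setup the augmented model has lower error by construction, because the synthetic Reddit series is built to track the current rate. Whether real Reddit data carries such a signal is the empirical question the study addresses.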