Allem Jon-Patrick, Majmundar Anuja, Dormanesh Allison, Donaldson Scott I
Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, United States.
Department of Surveillance and Health Equity Science, American Cancer Society, Kennesaw, GA, United States.
JMIR Form Res. 2022 Feb 25;6(2):e35027. doi: 10.2196/35027.
The cannabis product and regulatory landscape is changing in the United States. Against the backdrop of these changes, there have been increasing reports on health-related motives for cannabis use and adverse events from its use. The use of social media data in monitoring cannabis-related health conversations may be useful to state- and federal-level regulatory agencies as they grapple with identifying cannabis safety signals in a comprehensive and scalable fashion.
This study attempted to determine the extent to which a medical dictionary, the Unified Medical Language System Consumer Health Vocabulary, could identify health-related motivations for cannabis use and health consequences of cannabis use in Twitter posts from 2020.
Twitter posts containing cannabis-related terms were obtained from January 1 to August 31, 2020. Each post from the sample (N=353,353) was classified into at least 1 of 17 a priori categories of common health-related topics by using a rule-based classifier. Each category was defined by the terms in the medical dictionary. A subsample of posts (n=1092) was then manually annotated to help validate the rule-based classifier and determine if each post pertained to health-related motivations for cannabis use, perceived adverse health effects from its use, or neither.
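The dictionary-driven, rule-based classification described above can be illustrated with a minimal sketch. The abstract does not name the 17 categories or list their terms, so the category names and term lists below are hypothetical placeholders, and the whole-word, case-insensitive matching rule is an assumption rather than the authors' documented method.

```python
import re

# Hypothetical category-to-term mapping. The study's 17 a priori categories
# and their Consumer Health Vocabulary terms are not reproduced here.
CATEGORY_TERMS = {
    "respiratory": ["cough", "asthma", "lungs"],
    "gastrointestinal": ["nausea", "vomiting", "stomach ache"],
    "mental_health": ["anxiety", "stress", "depression"],
}


def compile_patterns(category_terms):
    """Build one case-insensitive, whole-word regex per category."""
    patterns = {}
    for category, terms in category_terms.items():
        # Longer terms first so multiword terms are tried before shorter ones.
        ordered = sorted(terms, key=len, reverse=True)
        alternation = "|".join(re.escape(term) for term in ordered)
        patterns[category] = re.compile(
            r"\b(?:" + alternation + r")\b", re.IGNORECASE
        )
    return patterns


PATTERNS = compile_patterns(CATEGORY_TERMS)


def classify_post(text):
    """Return the set of categories whose terms appear in the post.

    A post can fall into several categories; an empty set means that no
    dictionary term matched (an assumption of this sketch).
    """
    return {
        category
        for category, pattern in PATTERNS.items()
        if pattern.search(text)
    }


if __name__ == "__main__":
    example = "Weed helps my anxiety but it makes me cough"
    print(sorted(classify_post(example)))
```

Because the rule only checks for the presence of a term, it cannot tell whether a post describes a motivation for use, an adverse effect, or neither, which is the distinction the manual annotation step was designed to make.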
The validation process indicated that the medical dictionary could identify health-related conversations in 31.2% (341/1092) of posts. Specifically, 20.4% (223/1092) of posts were accurately identified as posts related to a health-related motivation for cannabis use, while 10.8% (118/1092) of posts were accurately identified as posts related to a health-related consequence from cannabis use. The health-related conversations about cannabis use included those about issues with the respiratory system, stress to the immune system, and gastrointestinal issues, among others.
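The reported percentages follow directly from the annotation counts, as the short check below shows. The counts are taken from the text; the variable and function names are illustrative only.

```python
# Counts reported for the manually annotated subsample.
TOTAL_ANNOTATED = 1092
COUNTS = {
    "health_related_motivation": 223,
    "health_related_consequence": 118,
}


def percent(count, total):
    """Share of the total expressed as a percentage, rounded to 1 decimal."""
    return round(100 * count / total, 1)


# The two subcategories together make up the health-related posts.
health_related_total = sum(COUNTS.values())

if __name__ == "__main__":
    print("all health-related:", health_related_total,
          percent(health_related_total, TOTAL_ANNOTATED))
    for label, count in COUNTS.items():
        print(label, count, percent(count, TOTAL_ANNOTATED))
```

The two subcategories sum to the 341 health-related posts, so the remaining 751 of the 1092 annotated posts were not identified as health-related conversations.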
The mining of social media data may prove helpful in improving the surveillance of cannabis products and their adverse health effects. However, future research needs to develop and validate a dictionary and codebook that capture cannabis use-specific health conversations on Twitter.