Utilizing a multi-class classification approach to detect therapeutic and recreational misuse of opioids on Twitter.

Author information

Fodeh Samah Jamal, Al-Garadi Mohammed, Elsankary Osama, Perrone Jeanmarie, Becker William, Sarker Abeed

Affiliations

Department of Emergency Medicine, Yale School of Medicine, Yale University, New Haven, CT 06510, USA; VA Connecticut Healthcare System, West Haven, CT 06516, USA.

Department of Biomedical Informatics, School of Medicine, Emory University, Atlanta, GA 30322, USA.

Publication information

Comput Biol Med. 2021 Feb;129:104132. doi: 10.1016/j.compbiomed.2020.104132. Epub 2020 Nov 20.

Abstract

BACKGROUND

Opioid misuse (OM) is a major health problem in the United States, and can lead to addiction and fatal overdose. We sought to employ natural language processing (NLP) and machine learning to categorize Twitter chatter based on the motive of OM.

MATERIALS AND METHODS

We collected data from Twitter using opioid-related keywords and manually annotated 6988 tweets into three classes: No-OM, Pain-related-OM, and Recreational-OM. The No-OM class represents tweets indicating no use/misuse, while the Pain-related-OM and Recreational-OM classes represent misuse for pain or for recreation/addiction, respectively. We trained and evaluated multi-class classifiers, and performed term-level k-means clustering to assess whether there were terms closely associated with the three classes.
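The term-level k-means step can be illustrated with a minimal sketch. The abstract does not say how terms were represented as vectors or which implementation was used, so everything here is an assumption: the terms, their two-dimensional vectors, and the `kmeans` helper are hypothetical toy values chosen only to show the assign-then-update loop of Lloyd's algorithm, not the study's data or code.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm.

    Assumption for this toy example: the first k points serve as the
    initial centroids, which keeps the run deterministic.
    """
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment, centroids

# Hypothetical terms with made-up 2-D vectors (not taken from the study).
term_vectors = {
    "surgery": (1.0, 0.1),
    "party": (0.1, 1.0),
    "news": (-1.0, -1.0),
    "backache": (0.9, 0.2),
    "high": (0.2, 0.9),
    "policy": (-0.9, -1.1),
}

terms = list(term_vectors)
assignment, centroids = kmeans([term_vectors[t] for t in terms], k=3)

# Group the terms by the cluster index they were assigned to.
clusters = {}
for term, cluster_id in zip(terms, assignment):
    clusters.setdefault(cluster_id, []).append(term)
print(clusters)
```

In practice the vectors would come from some learned term representation, and the resulting clusters would be inspected to see whether they line up with the annotated classes.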

RESULTS

On a held-out test set of 1677 tweets, a transformer-based classifier (XLNet) achieved the best performance, with F-scores of 0.71 for the Pain-related-OM class and 0.79 for the Recreational-OM class. Macro- and micro-averaged F-scores over all classes were 0.82 and 0.92, respectively. Content analysis using clustering revealed distinct clusters of terms associated with each class.
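The gap between the macro- and micro-averaged scores is typical of imbalanced classes, and a small sketch may make the two averages concrete. The labels and predictions below are a hypothetical ten-tweet toy set, not the study's data, and the helper names are illustrative. Macro-averaging takes the unweighted mean of the per-class F-scores, while micro-averaging pools true positives, false positives, and false negatives across classes before computing a single score.

```python
def class_counts(y_true, y_pred, label):
    """Return (tp, fp, fn) for one label, treated one-vs-rest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    return tp, fp, fn

def f1_from_counts(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the class never appears."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def macro_f1(y_true, y_pred, labels):
    # Unweighted mean of per-class F1: every class counts equally.
    scores = [f1_from_counts(*class_counts(y_true, y_pred, l)) for l in labels]
    return sum(scores) / len(scores)

def micro_f1(y_true, y_pred, labels):
    # Pool the counts over all classes first, then compute one F1.
    totals = [class_counts(y_true, y_pred, l) for l in labels]
    tp, fp, fn = (sum(column) for column in zip(*totals))
    return f1_from_counts(tp, fp, fn)

# Hypothetical toy data: a large majority class and two small minority classes.
labels = ["No-OM", "Pain-related-OM", "Recreational-OM"]
y_true = ["No-OM"] * 6 + ["Pain-related-OM"] * 2 + ["Recreational-OM"] * 2
y_pred = (["No-OM"] * 6
          + ["Pain-related-OM", "Recreational-OM"]
          + ["Recreational-OM"] * 2)

print("macro F1:", round(macro_f1(y_true, y_pred, labels), 4))
print("micro F1:", round(micro_f1(y_true, y_pred, labels), 4))
```

On this toy set the single error falls in a minority class, so the macro average drops further than the micro average, which is dominated by the well-classified majority class.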

DISCUSSION

While some past studies have attempted to automatically detect opioid misuse, none have further characterized the motive for misuse. Our multi-class classification approach using XLNet showed promising performance, including in detecting the subtle differences between pain-related and recreation-related misuse. The distinct clustering of class-specific keywords may help conduct targeted data collection, overcoming under-representation of minority classes.

CONCLUSION

Machine learning can help identify pain-related and recreation-related OM content on Twitter, potentially enabling the study of the characteristics of individuals exhibiting such behavior.
