Text classification models for the automatic detection of nonmedical prescription medication use from social media.

Author affiliations

Department of Biomedical Informatics, School of Medicine, Emory University, 101 Woodruff Circle, Atlanta, GA, 30322, USA.

Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.

Publication information

BMC Med Inform Decis Mak. 2021 Jan 26;21(1):27. doi: 10.1186/s12911-021-01394-0.


DOI:10.1186/s12911-021-01394-0
PMID:33499852
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7835447/
Abstract

BACKGROUND: Prescription medication (PM) misuse/abuse has emerged as a national crisis in the United States, and social media has been suggested as a potential resource for performing active monitoring. However, automating a social media-based monitoring system is challenging, requiring advanced natural language processing (NLP) and machine learning methods. In this paper, we describe the development and evaluation of automatic text classification models for detecting self-reports of PM abuse from Twitter.

METHODS: We experimented with state-of-the-art bi-directional transformer-based language models, which utilize tweet-level representations that enable transfer learning (e.g., BERT, RoBERTa, XLNet, ALBERT, and DistilBERT), proposed fusion-based approaches, and compared the developed models with several traditional machine learning approaches, including deep learning approaches. Using a public dataset, we evaluated the performances of the classifiers on their abilities to classify the non-majority "abuse/misuse" class.

RESULTS: Our proposed fusion-based model performs significantly better than the best traditional model (F-score [95% CI]: 0.67 [0.64-0.69] vs. 0.45 [0.42-0.48]). We illustrate, via experimentation using varying training set sizes, that the transformer-based models are more stable and require less annotated data compared to the other models. The significant improvements achieved by our best-performing classification model over past approaches make it suitable for automated continuous monitoring of nonmedical PM use from Twitter.

CONCLUSIONS: BERT, BERT-like and fusion-based models outperform traditional machine learning and deep learning models, achieving substantial improvements over many years of past research on the topic of prescription medication misuse/abuse classification from social media, which had been shown to be a complex task due to the unique ways in which information about nonmedical use is presented. Several challenges associated with the lack of context and the nature of social media language need to be overcome to further improve BERT and BERT-like models. These experimentally driven challenges are presented as potential future research directions.
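The results above are reported as an F-score for the minority "abuse/misuse" class together with a 95% confidence interval. The abstract does not say how the interval was computed, so the following is only a minimal sketch under one common assumption: a percentile bootstrap over the test set. The label strings (`"abuse"`, `"other"`), the function names, and the toy data are all hypothetical and are not taken from the paper.

```python
import random


def f1_for_class(y_true, y_pred, positive):
    """F1 score for a single class, treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def bootstrap_ci(y_true, y_pred, positive, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the class F1 (assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_resamples):
        # Resample test instances with replacement and rescore.
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(
            f1_for_class([y_true[i] for i in idx], [y_pred[i] for i in idx], positive)
        )
    scores.sort()
    lower = scores[int((alpha / 2) * n_resamples)]
    upper = scores[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


# Hypothetical toy data: 3 "abuse" tweets among 10, with one miss and one false alarm.
y_true = ["abuse", "abuse", "abuse"] + ["other"] * 7
y_pred = ["abuse", "abuse", "other", "abuse"] + ["other"] * 6

f1 = f1_for_class(y_true, y_pred, positive="abuse")
lower, upper = bootstrap_ci(y_true, y_pred, positive="abuse")
print(f"F1 = {f1:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

Scoring only the minority class, rather than averaging over all classes, keeps a classifier that simply predicts the majority label from looking strong. On a set as small as this toy example the bootstrap interval will be very wide.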


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ab8/7836501/3f0b7bbbecd9/12911_2021_1394_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ab8/7836501/7305291fe733/12911_2021_1394_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ab8/7836501/c70e304ab7de/12911_2021_1394_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ab8/7836501/c6920b0b8339/12911_2021_1394_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ab8/7836501/6466f1e94695/12911_2021_1394_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ab8/7836501/c2df11ffa393/12911_2021_1394_Fig6_HTML.jpg
