Kim Myeong Gyu, Kim Jungu, Kim Su Cheol, Jeong Jaegwon
Graduate School of Clinical Pharmacy, CHA University, Pocheon, Republic of Korea.
Department of Psychiatry, Anam Hospital, Seoul, Republic of Korea.
J Med Internet Res. 2020 Feb 24;22(2):e16466. doi: 10.2196/16466.
Methylphenidate, a stimulant used to treat attention deficit hyperactivity disorder, has the potential to be used nonmedically, such as for studying and recreation. In an era when many people actively use social networking services, experience with the nonmedical use or side effects of methylphenidate might be shared on Twitter.
The purpose of this study was to analyze tweets about the nonmedical use and side effects of methylphenidate using a machine learning approach.
A total of 34,293 tweets mentioning methylphenidate from August 2018 to July 2019 were collected using searches for "methylphenidate" and its brand names. Tweets in a randomly selected training dataset (6860/34,293, 20.00%) were annotated as positive or negative for two dependent variables: nonmedical use and side effects. Features such as personal nouns, nonmedical use terms, medical use terms, side effect terms, sentiment scores, and the presence of a URL were generated for supervised learning. Using the labeled training dataset and these features, support vector machine (SVM) classifiers were built, and their performance was evaluated using F scores. The classifiers were then applied to the test dataset to determine the number of tweets about nonmedical use and side effects.
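The feature-generation step described above can be sketched roughly as follows. This is a minimal illustration, not the study's implementation: the term lists, the function name `extract_features`, and the feature names are all hypothetical, since the abstract does not give the actual lexicons. Sentiment scoring is left out because the abstract does not say which tool produced it.

```python
import re

# Hypothetical term lists for illustration only; the lexicons actually
# used in the study are not given in the abstract.
PERSONAL_WORDS = {"i", "my", "me", "mine", "myself"}
NONMEDICAL_TERMS = {"exam", "study", "studying", "party", "snort"}
MEDICAL_TERMS = {"prescribed", "prescription", "doctor", "adhd"}
SIDE_EFFECT_TERMS = {"insomnia", "headache", "nausea", "anxiety"}

URL_PATTERN = re.compile(r"https?://\S+")


def extract_features(tweet: str) -> dict:
    """Map one tweet to binary features (1 = present, 0 = absent)."""
    has_url = bool(URL_PATTERN.search(tweet))
    # Strip URLs before tokenizing so their fragments do not match any term.
    text = URL_PATTERN.sub(" ", tweet.lower())
    tokens = set(re.findall(r"[a-z']+", text))
    return {
        "personal": int(bool(tokens & PERSONAL_WORDS)),
        "nonmedical_term": int(bool(tokens & NONMEDICAL_TERMS)),
        "medical_term": int(bool(tokens & MEDICAL_TERMS)),
        "side_effect_term": int(bool(tokens & SIDE_EFFECT_TERMS)),
        "has_url": int(has_url),
    }


example = extract_features(
    "I took ritalin to study for my exam and now I have insomnia"
)
print(example)
```

Feature dictionaries of this kind, once converted to numeric vectors and paired with the annotated labels, could be passed to any SVM implementation. One classifier would be trained per dependent variable.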
Of the 6860 tweets in the training dataset, 5.19% (356/6860) and 5.52% (379/6860) were about nonmedical use and side effects, respectively. The F scores of the SVM classifiers for nonmedical use and side effects were 0.547 (precision: 0.926, recall: 0.388, and accuracy: 0.967) and 0.733 (precision: 0.920, recall: 0.609, and accuracy: 0.976), respectively. In the test dataset, the SVM classifiers identified 361 tweets (1.32%) about nonmedical use and 519 tweets (1.89%) about side effects. The proportion of tweets about nonmedical use was highest in May 2019 (46/2624, 1.75%) and December 2018 (36/2041, 1.76%).
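The reported figures can be checked with a short calculation. The sketch below makes two assumptions: that the F score is the balanced F1 score (the harmonic mean of precision and recall), which is consistent with the reported numbers, and that the test dataset consists of all tweets not selected for training. The abstract states neither explicitly.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F score (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
nonmedical_f = f_score(0.926, 0.388)
side_effect_f = f_score(0.920, 0.609)
print(round(nonmedical_f, 3))   # 0.547
print(round(side_effect_f, 3))  # 0.733

# Assumption: the test set is every collected tweet outside the training set.
test_size = 34293 - 6860
print(test_size)                         # 27433
print(round(100 * 361 / test_size, 2))   # 1.32
print(round(100 * 519 / test_size, 2))   # 1.89
```

The low recall for nonmedical use (0.388) is what pulls its F score well below its precision. Such a classifier rarely flags a tweet incorrectly but misses most of the relevant ones.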
The SVM classifiers that were built in this study were highly precise and accurate and will help to automatically identify the nonmedical use and side effects of methylphenidate using Twitter.