Supply Chain and Information Management Group, D'Amore-McKim School of Business, Northeastern University, Boston, MA, United States.
College of Business and Information Systems, Dakota State University, Madison, SD, United States.
J Med Internet Res. 2020 Aug 13;22(8):e18350. doi: 10.2196/18350.
Social media are considered promising and viable sources of data for gaining insights into various disease conditions and patients' attitudes, behaviors, and medications. They can be used to recognize communication and behavioral themes of problematic use of prescription drugs. However, mining and analyzing social media data have challenges and limitations related to topic deduction and data quality. As a result, we need a structured approach to analyze social media content related to drug abuse in a manner that can mitigate the challenges and limitations surrounding the use of such data.
This study aimed to develop and evaluate a framework for mining and analyzing social media content related to drug abuse. The framework is designed to mitigate challenges and limitations related to topic deduction and data quality in social media data analytics for drug abuse.
The proposed framework started by defining terms related to the keywords, categories, and characteristics of the topic of interest. We then used the Crimson Hexagon platform to collect data with a search query informed by a drug abuse ontology built from the identified terms. We subsequently preprocessed the data and examined their quality using an evaluation matrix. Finally, a suitable data analysis approach could be applied to the collected data.
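The steps described above (terms, ontology-informed query, preprocessing, quality check) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the ontology contents, the function names `build_query`, `preprocess`, and `quality_report`, and the generic Boolean query syntax are all hypothetical, and the quality counts are a simple stand-in for the evaluation matrix, whose actual criteria are not given here. No Crimson Hexagon API is called.

```python
import re

# Hypothetical mini-ontology (category -> terms); illustrative only,
# not the ontology developed in the study.
ONTOLOGY = {
    "drug_names": ["oxycodone", "fentanyl", "hydrocodone"],
    "street_terms": ["oxy", "percs"],
    "abuse_terms": ["overdose", "addiction", "withdrawal"],
}


def build_query(ontology):
    """Build a generic Boolean query: (any drug term) AND (any abuse term)."""
    def group(terms):
        # Quote multi-word terms so they are treated as phrases.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        return "(" + " OR ".join(quoted) + ")"

    drugs = ontology["drug_names"] + ontology["street_terms"]
    return f"{group(drugs)} AND {group(ontology['abuse_terms'])}"


def preprocess(post):
    """Lowercase, strip URLs and @mentions, keep letters only, collapse spaces."""
    text = post.lower()
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"@\w+", " ", text)
    text = re.sub(r"[^a-z\s]", " ", text)
    return " ".join(text.split())


def quality_report(posts, ontology):
    """Count total posts, unique posts after cleaning, and posts that
    contain at least one single-word ontology term (assumed relevance proxy)."""
    cleaned = [preprocess(p) for p in posts]
    vocab = {t for terms in ontology.values() for t in terms}
    unique = set(cleaned)
    relevant = [c for c in unique if vocab & set(c.split())]
    return {"total": len(posts), "unique": len(unique), "relevant": len(relevant)}


if __name__ == "__main__":
    sample = [
        "Fentanyl overdose again",
        "fentanyl overdose again!",
        "nice weather today",
    ]
    print(build_query(ONTOLOGY))
    print(quality_report(sample, ONTOLOGY))
```

Grouping terms by category keeps the query readable as the ontology grows, and running the quality check on cleaned text means near-duplicate posts that differ only in case or punctuation are counted once.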
The framework was evaluated using the opioid epidemic as a drug abuse case analysis. We demonstrated the applicability of the proposed framework to identify public concerns about the opioid epidemic and the most discussed opioid-related topics on social media. The results from the case analysis showed that the framework could improve the discovery and identification of topics in social media domains characterized by a plethora of highly diverse terms and the lack of a dictionary or vocabulary shared by the community, as is the case with opioid and drug abuse.
The proposed framework addressed the challenges related to topic detection and data quality. We demonstrated the applicability of the proposed framework to identify common concerns about the opioid epidemic and the most discussed opioid-related topics on social media.