Lee Juhan, Ouellette Rachel R, Murthy Dhiraj, Pretzer Ben, Anand Tanvi, Kong Grace
Department of Psychiatry, Yale School of Medicine, New Haven, CT, USA.
School of Journalism and Media, University of Texas at Austin, Austin, TX, USA.
Nicotine Tob Res. 2024 Dec 23;27(1):91-96. doi: 10.1093/ntr/ntae171.
Hashtags are commonly used to promote e-cigarette content on social media, so analyzing them may provide insight into e-cigarette promotion on these platforms. However, the sheer volume of social media data complicates the examination of text. This study used a machine learning approach (ie, Bidirectional Encoder Representations from Transformers [BERT] topic modeling) to identify e-cigarette content on TikTok.
We used 13 unique hashtags related to e-cigarettes (eg, #vape) for data collection. The final analytic sample included 12 573 TikTok posts. To identify the best-fitting number of topic clusters, we used both quantitative (ie, coherence test) and qualitative (ie, researchers checked the relevance of text from each topic) approaches. We then grouped and characterized the clustered text for each theme.
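The quantitative step described above, choosing the number of topic clusters by coherence, can be sketched as follows. This is a minimal illustration and not the study's pipeline: the abstract does not say which coherence measure was used, so a UMass-style score (averaged over word pairs) is assumed here, and the function names, toy documents, and candidate topic sets are all hypothetical. In the study, the coherence result was complemented by researchers reviewing the relevance of each topic's text.

```python
import math
from itertools import combinations


def umass_coherence(top_words, docs):
    """UMass-style coherence for one topic (assumed measure, not from the study).

    Averages log((D(wi, wj) + 1) / D(wi)) over ordered pairs of top words,
    where D counts the documents containing all of the given words.
    """
    doc_sets = [set(d) for d in docs]

    def doc_freq(*words):
        return sum(1 for d in doc_sets if all(w in d for w in words))

    scores = []
    for wi, wj in combinations(top_words, 2):
        denom = doc_freq(wi)
        if denom == 0:
            continue  # skip pairs whose leading word never occurs in the corpus
        scores.append(math.log((doc_freq(wi, wj) + 1) / denom))
    return sum(scores) / len(scores) if scores else float("-inf")


def mean_coherence(topics, docs):
    """Average coherence across all topics of one candidate model."""
    return sum(umass_coherence(t, docs) for t in topics) / len(topics)


def best_topic_count(candidates, docs):
    """candidates maps a topic count to that model's list of top-word lists."""
    return max(candidates, key=lambda k: mean_coherence(candidates[k], docs))


# Hypothetical toy corpus of tokenized posts (not study data).
docs = [
    ["vape", "shop", "store"],
    ["vape", "trick", "cloud"],
    ["shop", "store", "brand"],
    ["trick", "cloud", "ring"],
]

# Hypothetical top words from two candidate models with 2 and 3 topics.
candidates = {
    2: [["shop", "store"], ["trick", "cloud"]],
    3: [["shop", "cloud"], ["trick", "store"], ["vape", "ring"]],
}

chosen = best_topic_count(candidates, docs)
```

In this toy case the 2-topic candidate scores higher because its top words actually co-occur in the same posts. On real data, each candidate's top words would come from a fitted topic model rather than being written by hand.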
We determined that N = 18 was the ideal number of topic clusters. Nine overarching themes were identified: Social media and TikTok-related features (N = 4; "duet," "viral"), Vape shops and brands (N = 3; "store"), Vape tricks (N = 3; "ripsaw"), Modified use of e-cigarettes (N = 1; "coil," "wire"), Vaping and girls (N = 1; "girl"), Vape flavors (N = 1; "flavors"), Vape and cigarettes (N = 1; "smoke"), Vape identities and communities (N = 1; "community"), and Non-English language (N = 3; Romanian and Spanish).
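As a quick consistency check, the per-theme cluster counts reported above can be tallied against the 18 topic clusters. The counts are taken directly from the text; the variable names are illustrative only.

```python
# Number of topic clusters assigned to each overarching theme, as reported.
theme_clusters = {
    "Social media and TikTok-related features": 4,
    "Vape shops and brands": 3,
    "Vape tricks": 3,
    "Modified use of e-cigarettes": 1,
    "Vaping and girls": 1,
    "Vape flavors": 1,
    "Vape and cigarettes": 1,
    "Vape identities and communities": 1,
    "Non-English language": 3,
}

total_clusters = sum(theme_clusters.values())
n_themes = len(theme_clusters)
print(f"{n_themes} themes covering {total_clusters} topic clusters")
```

The nine themes account for all 18 clusters, so every cluster was assigned to a theme.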
This study used a machine learning method, BERTopic modeling, to successfully identify relevant themes on TikTok. This method can inform future social media research examining other tobacco products, as well as tobacco regulatory policies such as monitoring of e-cigarette marketing on social media.