Murthy Dhiraj, Lee Juhan, Dashtian Hassan, Kong Grace
Computational Media Lab, School of Journalism and Media, Moody College of Communication, The University of Texas at Austin, Austin, TX, United States.
Department of Psychiatry, Yale School of Medicine, New Haven, CT, United States.
JMIR Infodemiology. 2023 Apr 12;3:e42218. doi: 10.2196/42218. eCollection 2023.
The proliferation of e-cigarette content on YouTube is concerning because of its possible effect on youth use behaviors. YouTube has a personalized search and recommendation algorithm that derives attributes from a user's profile, such as age and sex. However, little is known about whether e-cigarette content is shown differently based on user characteristics.
The aim of this study was to understand the influence of age and sex attributes of user profiles on e-cigarette-related YouTube search results.
We created 16 fictitious YouTube profiles that varied by age (16 and 24 years), sex (female and male), and ethnicity/race, and used them to search for 18 e-cigarette-related terms. We used unsupervised (k-means clustering and classification) and supervised (graph convolutional network) machine learning and network analysis to characterize the variation in the search results of each profile. We further examined whether user attributes may play a role in e-cigarette-related content exposure by using networks and degree centrality.
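To illustrate the degree centrality measure mentioned above, the following is a minimal sketch that assumes an undirected network linking profiles to the videos returned for them. The edge list, the profile and video labels, and the function name `degree_centrality` are hypothetical; the abstract does not specify how the study's networks were constructed.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Normalized degree centrality for an undirected graph given as edge pairs.

    Each node's degree is divided by (n - 1), the maximum possible degree
    in a simple graph with n nodes.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    n = len(neighbors)
    if n <= 1:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical toy data: which video appeared in which profile's search results.
edges = [
    ("profile_F16", "video_a"),
    ("profile_F16", "video_b"),
    ("profile_M24", "video_b"),
    ("profile_M24", "video_c"),
]

centrality = degree_centrality(edges)
for node, score in sorted(centrality.items()):
    print(f"{node}: {score:.2f}")
```

In this toy network, a video returned to more profiles (here `video_b`) has a higher centrality than a video returned to only one profile.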
We analyzed 4201 nonduplicate videos. Our k-means clustering suggested that the videos could be clustered into 3 categories. The graph convolutional network achieved an accuracy of 0.72. Videos were classified based on content into 4 categories: product review (49.3%), health information (15.1%), instruction (26.9%), and other (8.5%). Underage profiles (16 years) were exposed mostly to instructional videos (37.5%), with some indication that more female 16-year-old profiles were exposed to this content, whereas young adult profiles (24 years) were exposed mostly to product review videos (39.2%).
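As an illustration of k-means clustering with 3 clusters, the following is a minimal sketch of Lloyd's algorithm on toy data. The 2-dimensional feature vectors, the initial centroids, and the function name `kmeans` are assumptions made for the example; the abstract does not state which video features were clustered or how the number of clusters was chosen.

```python
import math


def kmeans(points, initial_centroids, iterations=20):
    """Lloyd's algorithm: alternate nearest-centroid assignment and mean update."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: label each point with the index of its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(xs) / len(members) for xs in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Hypothetical 2-D feature vectors standing in for video features.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0), (10, 1)]
labels, centroids = kmeans(points, [points[0], points[2], points[4]])
print(labels)
print(centroids)
```

The initial centroids are fixed here so that the run is deterministic; practical implementations usually use randomized or k-means++ initialization.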
Our results indicate that demographic attributes factor into YouTube's algorithmic systems in the context of e-cigarette-related queries. Specifically, differences in the age and sex attributes of user profiles result in variance both in the videos presented in YouTube search results and in the types of these videos. We find that underage profiles were exposed to e-cigarette content despite YouTube's age-restriction policy that ostensibly prohibits certain e-cigarette content. Greater enforcement of policies to restrict youth access to e-cigarette content is needed.