Deep Learning for Identification of Alcohol-Related Content on Social Media (Reddit and Twitter): Exploratory Analysis of Alcohol-Related Outcomes.

Affiliations

Department of Biomedical Data Science, Dartmouth College, Lebanon, NH, United States.

Department of Epidemiology, Dartmouth College, Hanover, NH, United States.

Publication information

J Med Internet Res. 2021 Sep 15;23(9):e27314. doi: 10.2196/27314.

DOI: 10.2196/27314
PMID: 34524095
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8482254/
Abstract

BACKGROUND

Many social media studies have explored the ability of thematic structures, such as hashtags and subreddits, to identify information related to a wide variety of mental health disorders. However, models trained on specific themed communities are often difficult to apply to other social media platforms and related outcomes. A deep learning framework using thematic structures from Reddit and Twitter can have distinct advantages for studying alcohol abuse, particularly among youth in the United States.

OBJECTIVE

This study proposes a new deep learning pipeline that uses thematic structures to identify alcohol-related content across different platforms. We apply our method to Twitter to determine the association between the prevalence of alcohol-related tweets and alcohol-related outcomes reported by the National Institute on Alcohol Abuse and Alcoholism, the Centers for Disease Control and Prevention Behavioral Risk Factor Surveillance System, the County Health Rankings, and the North American Industry Classification System.

METHODS

A Bidirectional Encoder Representations From Transformers (BERT) neural network was trained to classify 1,302,524 Reddit posts as originating from either alcohol-related or control subreddits. The trained model identified 24 alcohol-related hashtags from an unlabeled data set of 843,769 random tweets. Querying these hashtags identified 25,558,846 alcohol-related tweets, including 790,544 location-specific (geotagged) tweets. We calculated the correlation between the prevalence of alcohol-related tweets and alcohol-related outcomes, controlling for the confounding effects of age, sex, income, education, and self-reported race, as recorded by the 2013-2018 American Community Survey.
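The abstract does not say how the trained classifier was turned into a list of hashtags, so the following is only a minimal sketch of one plausible approach, not the authors' method. It assumes each unlabeled tweet already carries a classifier probability of being alcohol-related, then keeps hashtags whose tweets score highly on average. The function name `select_hashtags`, the thresholds, and the example tweets and scores are all hypothetical.

```python
import re
from collections import defaultdict

HASHTAG_RE = re.compile(r"#(\w+)")

def select_hashtags(scored_tweets, min_tweets=2, min_mean_score=0.8):
    """Keep hashtags whose tweets score as alcohol-related on average.

    scored_tweets: iterable of (text, score) pairs, where score is an
    assumed classifier probability in [0, 1] (hypothetical input format).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for text, score in scored_tweets:
        # Count each hashtag once per tweet, ignoring case.
        for tag in {t.lower() for t in HASHTAG_RE.findall(text)}:
            totals[tag] += score
            counts[tag] += 1
    selected = {
        tag: totals[tag] / counts[tag]
        for tag in counts
        if counts[tag] >= min_tweets
        and totals[tag] / counts[tag] >= min_mean_score
    }
    # Highest mean score first.
    return dict(sorted(selected.items(), key=lambda kv: kv[1], reverse=True))

# Made-up tweets and scores, for illustration only.
example = [
    ("Friday night #beer with friends", 0.97),
    ("New #beer release at the taproom #craftbeer", 0.93),
    ("Tasting flight #craftbeer", 0.91),
    ("Morning run done #fitness", 0.03),
    ("Leg day #fitness #beer", 0.60),
]
result = select_hashtags(example)
print(result)
```

In this toy input, `#fitness` is dropped because its tweets score low on average, while the two beer-related tags pass both thresholds. Requiring a minimum tweet count guards against selecting a hashtag on the strength of a single high-scoring tweet.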

RESULTS

Significant associations were observed: between alcohol-hashtagged tweets and alcohol consumption (P=.01) and heavy drinking (P=.005) but not binge drinking (P=.37), self-reported at the metropolitan-micropolitan statistical area level; between alcohol-hashtagged tweets and self-reported excessive drinking behavior (P=.03) but not motor vehicle fatalities involving alcohol (P=.21); between alcohol-hashtagged tweets and the number of breweries (P<.001), wineries (P<.001), and beer, wine, and liquor stores (P<.001) but not drinking places (P=.23), per capita at the US county and county-equivalent level; and between alcohol-hashtagged tweets and all gallons of ethanol consumed (P<.001), as well as ethanol consumed from wine (P<.001) and liquor (P=.01) sources but not beer (P=.63), at the US state level.
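To make "controlling for confounding effects" concrete, here is a minimal sketch of a first-order partial correlation, which measures the association between two variables after removing the linear effect of a single covariate. The abstract does not state the exact statistical procedure, and the study adjusted for several covariates at once, so this single-covariate version is an illustrative simplification. All variable names and numbers below are made up.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_correlation(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical per-region values, for illustration only.
tweet_rate = [1.2, 2.5, 3.1, 4.8, 5.0]      # alcohol-hashtagged tweets per capita
heavy_drinking = [5.1, 6.0, 6.8, 8.2, 8.0]  # self-reported heavy drinking (%)
median_income = [40, 52, 49, 70, 66]        # single confounder (thousands USD)

r_raw = pearson(tweet_rate, heavy_drinking)
r_adj = partial_correlation(tweet_rate, heavy_drinking, median_income)
print(f"raw r = {r_raw:.3f}, adjusted for income r = {r_adj:.3f}")
```

With several covariates, a common alternative is to regress both variables on all covariates and correlate the residuals, which reduces to the formula above when there is only one covariate.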

CONCLUSIONS

Here, we present a novel natural language processing pipeline, developed using Reddit's alcohol-related subreddits, that identifies highly specific alcohol-related Twitter hashtags. The prevalence of the identified hashtags contains interpretable information about alcohol consumption at both coarse (eg, US state) and fine-grained (eg, metropolitan-micropolitan statistical area and county) geographical levels. This approach can expand research and deep learning interventions on alcohol abuse and other behavioral health outcomes.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/65a5/8482254/8821f75db4d1/jmir_v23i9e27314_fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/65a5/8482254/952c063ced6c/jmir_v23i9e27314_fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/65a5/8482254/e793932a6dfc/jmir_v23i9e27314_fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/65a5/8482254/b9df1abd2a86/jmir_v23i9e27314_fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/65a5/8482254/aa030c35128c/jmir_v23i9e27314_fig5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/65a5/8482254/0ff581855249/jmir_v23i9e27314_fig6.jpg