Discovering health topics in social media using topic models.

Author information

Michael J. Paul, Mark Dredze

Affiliations

Department of Computer Science and Center for Language and Speech Processing, Johns Hopkins University, Baltimore, Maryland, United States of America.

Department of Computer Science and Center for Language and Speech Processing, Johns Hopkins University, Baltimore, Maryland, United States of America; Human Language Technology Center of Excellence and Department of Computer Science, Johns Hopkins University, Baltimore, Maryland, United States of America.

Publication information

PLoS One. 2014 Aug 1;9(8):e103408. doi: 10.1371/journal.pone.0103408. eCollection 2014.

DOI:10.1371/journal.pone.0103408
PMID:25084530
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4118877/
Abstract

By aggregating self-reported health statuses across millions of users, we seek to characterize the variety of health information discussed in Twitter. We describe a topic modeling framework for discovering health topics in Twitter, a social media website. This is an exploratory approach with the goal of understanding what health topics are commonly discussed in social media. This paper describes in detail a statistical topic model created for this purpose, the Ailment Topic Aspect Model (ATAM), as well as our system for filtering general Twitter data based on health keywords and supervised classification. We show how ATAM and other topic models can automatically infer health topics in 144 million Twitter messages from 2011 to 2013. ATAM discovered 13 coherent clusters of Twitter messages, some of which correlate with seasonal influenza (r = 0.689) and allergies (r = 0.810) temporal surveillance data, as well as exercise (r = 0.534) and obesity (r = -0.631) related geographic survey data in the United States. These results demonstrate that it is possible to automatically discover topics that attain statistically significant correlations with ground truth data, despite using minimal human supervision and no historical data to train the model, in contrast to prior work. Additionally, these results demonstrate that a single general-purpose model can identify many different health topics in social media.
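
The abstract mentions two steps that are easy to illustrate in isolation: filtering messages by health keywords, and correlating a discovered topic's prevalence with external surveillance data using Pearson's r. The following is a minimal sketch of both, not the authors' implementation. The keyword set, the function names `mentions_health` and `pearson_r`, and all sample values are assumptions made up for illustration; the supervised classifier and ATAM itself are not reproduced here.

```python
import math

# Hypothetical keyword list; the paper's actual health keyword set is not given here.
HEALTH_KEYWORDS = {"flu", "fever", "allergy", "allergies", "cough", "headache"}


def mentions_health(message, keywords=HEALTH_KEYWORDS):
    """Return True if any keyword appears as a whole token in the message."""
    tokens = {tok.strip(".,!?;:\"'()#").lower() for tok in message.split()}
    return not tokens.isdisjoint(keywords)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy messages, invented for illustration only.
messages = [
    "Stuck in bed with the flu again",
    "Great game tonight!",
    "My #allergies are terrible this spring",
]
health_messages = [m for m in messages if mentions_health(m)]

# Toy weekly series, invented for illustration only: the share of messages
# assigned to one topic, and an external surveillance rate for the same weeks.
topic_share = [0.010, 0.014, 0.021, 0.030, 0.024, 0.015]
surveillance_rate = [1.1, 1.6, 2.4, 3.2, 2.9, 1.8]

print(len(health_messages), "of", len(messages), "messages pass the keyword filter")
print("r =", round(pearson_r(topic_share, surveillance_rate), 3))
```

In a fuller pipeline, keyword matching would typically be only a first pass, with a trained classifier removing messages that mention a keyword without reporting on health, as the abstract's reference to supervised classification suggests.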

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8aea/4118877/e3b0fd1cb24c/pone.0103408.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8aea/4118877/cb28e8049a80/pone.0103408.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8aea/4118877/01b6343554c1/pone.0103408.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8aea/4118877/5bae247251dd/pone.0103408.g004.jpg