Characterizing Twitter Discussions About HPV Vaccines Using Topic Modeling and Community Detection.

Author information

Surian Didi, Nguyen Dat Quoc, Kennedy Georgina, Johnson Mark, Coiera Enrico, Dunn Adam G

Affiliation

Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, North Ryde, New South Wales, Australia.

Publication information

J Med Internet Res. 2016 Aug 29;18(8):e232. doi: 10.2196/jmir.6045.

Abstract

BACKGROUND

In public health surveillance, measuring how information enters and spreads through online communities may help us understand geographical variation in decision making associated with poor health outcomes.

OBJECTIVE

Our aim was to evaluate the use of community structure and topic modeling methods as a process for characterizing the clustering of opinions about human papillomavirus (HPV) vaccines on Twitter.

METHODS

The study examined Twitter posts (tweets) about HPV vaccines collected between October 2013 and October 2015. We tested Latent Dirichlet Allocation and Dirichlet Multinomial Mixture (DMM) models for inferring the topics associated with tweets, and two methods for detecting the community structure of users from their social connections: community agglomeration (Louvain) and the encoding of random walks (Infomap). We examined the alignment between community structure and topics using several common clustering alignment measures and introduced a statistical measure of alignment based on the concentration of specific topics within a small number of communities. Visualizations of the topics and of the alignment between topics and communities are presented to support the interpretation of the results in the context of public health communication and the identification of communities at risk of rejecting the safety and efficacy of HPV vaccines.
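To make the idea of alignment between community structure and topics concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not name the "common clustering alignment measures" it used; normalized mutual information (NMI) is assumed here as one typical example. The `top_k_share` helper is a hypothetical illustration of topic concentration within a small number of communities and is not the statistical measure introduced in the paper. All labels are toy data.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (natural log) of a labeling."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two labelings of the same items."""
    n = len(a)
    joint = Counter(zip(a, b))
    count_a, count_b = Counter(a), Counter(b)
    return sum(
        (n_xy / n) * log(n_xy * n / (count_a[x] * count_b[y]))
        for (x, y), n_xy in joint.items()
    )


def nmi(a, b):
    """NMI with arithmetic-mean normalization (an assumed choice)."""
    if len(a) != len(b):
        raise ValueError("labelings must cover the same items")
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0 or h_b == 0:
        # Degenerate case: at least one labeling has a single cluster.
        return 1.0 if h_a == h_b else 0.0
    return mutual_information(a, b) / ((h_a + h_b) / 2)


def top_k_share(communities, topics, topic, k=1):
    """Hypothetical helper: share of one topic's tweets that fall in
    the k communities where that topic is most frequent."""
    counts = Counter(c for c, t in zip(communities, topics) if t == topic)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(n for _, n in counts.most_common(k)) / total


# Toy data: one community label and one topic label per tweet.
communities = ["A", "A", "A", "B", "B", "C"]
topics = ["safety", "safety", "safety", "safety", "news", "news"]

print(nmi(communities, topics))
print(top_k_share(communities, topics, "safety", k=1))
```

With real data, the community labels would come from a method such as Louvain or Infomap applied to the follower network, and the topic labels from a model such as DMM. Higher NMI means that knowing a user's community tells you more about the topics of their tweets.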

RESULTS

We analyzed 285,417 tweets about HPV vaccines from 101,519 users connected by 4,387,524 social connections. When we examined the alignment between the community structure and the topics of tweets, the Louvain community detection algorithm together with DMM produced consistently higher alignment values, and alignments were generally higher when the number of topics was lower. After applying the Louvain method and DMM with 30 topics and grouping semantically similar topics in a hierarchy, we characterized 163,148 (57.16%) tweets as evidence and advocacy, and 6244 (2.19%) tweets as describing personal experiences. Among the 4548 users who posted experiential tweets, 3449 users (75.84%) were found in communities where the majority of tweets were about evidence and advocacy.
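The reported percentages can be reproduced from the counts given above. The short check below uses only figures stated in the abstract; the variable names and the `pct` helper are illustrative.

```python
# Counts as reported in the Results paragraph.
total_tweets = 285_417
evidence_advocacy_tweets = 163_148
personal_experience_tweets = 6_244
experiential_users = 4_548
experiential_users_in_evidence_communities = 3_449


def pct(part, whole):
    """Percentage of `whole` represented by `part`, to two decimals."""
    return round(100 * part / whole, 2)


print(pct(evidence_advocacy_tweets, total_tweets))      # 57.16
print(pct(personal_experience_tweets, total_tweets))    # 2.19
print(pct(experiential_users_in_evidence_communities,
          experiential_users))                          # 75.84
```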

CONCLUSIONS

The use of community detection in concert with topic modeling appears to be a useful way to characterize Twitter communities for the purpose of opinion surveillance in public health applications. Our approach may help identify online communities at risk of being influenced by negative opinions about public health interventions such as HPV vaccines.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d5dd/5020315/2c7616f718e0/jmir_v18i8e232_fig1.jpg
