Patient Voices in Dialysis Care: Sentiment Analysis and Topic Modeling Study of Social Media Discourse.

Author information

Shankar Ravi, Xu Qian, Bundele Anjali

Affiliations

Medical Affairs - Research Innovation & Enterprise, Alexandra Hospital, Singapore, Singapore.

School of Civil, Aerospace and Design Engineering, University of Bristol, Bristol, United Kingdom.

Publication information

J Med Internet Res. 2025 May 15;27:e70128. doi: 10.2196/70128.

BACKGROUND

Patients with end-stage kidney disease undergoing dialysis face significant physical, psychological, and social challenges that impact their quality of life. Social media platforms such as X (formerly known as Twitter) have become important outlets for these patients to share experiences and exchange information.

OBJECTIVE

This study aimed to uncover key themes, emotions, and challenges expressed by the dialysis community on X from April 2006 to August 2024 by leveraging natural language processing techniques, specifically sentiment analysis and topic modeling.

METHODS

We collected 12,976 publicly available X posts related to dialysis using the platform's application programming interface version 2 and Python's Tweepy library. After rigorous preprocessing, 58.13% (7543/12,976) of the posts were retained for analysis. The emotional tone of each post was classified using the Valence Aware Dictionary and Sentiment Reasoner (VADER) model, a rule-based sentiment analyzer specifically attuned to social media content. VADER uses a human-curated lexicon that maps lexical features to sentiment scores, taking punctuation, capitalization, and modifiers into account. For topic modeling, posts with <50 tokens were removed, leaving 53.81% (4059/7543) of the posts, which were analyzed using latent Dirichlet allocation with coherence score optimization to identify the optimal number of topics (k=8). The analysis pipeline was implemented using Python's Natural Language Toolkit, Gensim, and scikit-learn libraries, with hyperparameter tuning to maximize model performance.
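
The two filtering and labeling steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the compound-score cutoffs of ±0.05 are the conventional VADER defaults and are assumed here because the abstract does not state them, whitespace splitting stands in for whatever tokenizer the study used, and the function names `label_sentiment` and `keep_for_topic_modeling` are hypothetical. The compound score itself would come from a VADER analyzer (for example, the `compound` value returned by `SentimentIntensityAnalyzer.polarity_scores`).

```python
# Assumed cutoffs: conventional VADER thresholds, not stated in the abstract.
POS_CUTOFF = 0.05
NEG_CUTOFF = -0.05

# From the abstract: posts with fewer than 50 tokens were dropped before LDA.
MIN_TOKENS = 50


def label_sentiment(compound: float) -> str:
    """Map a VADER compound score in [-1, 1] to a sentiment label."""
    if compound >= POS_CUTOFF:
        return "positive"
    if compound <= NEG_CUTOFF:
        return "negative"
    return "neutral"


def keep_for_topic_modeling(post: str, min_tokens: int = MIN_TOKENS) -> bool:
    """Return True if the post is long enough for topic modeling.

    Whitespace tokenization is a stand-in (assumption) for the
    study's actual tokenizer.
    """
    return len(post.split()) >= min_tokens


if __name__ == "__main__":
    # Hypothetical compound scores, for illustration only.
    for score in (0.62, 0.0, -0.41):
        print(score, label_sentiment(score))
```

Keeping the cutoffs as named constants makes it easy to check how sensitive the positive, negative, and neutral proportions are to the threshold choice.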

RESULTS

Sentiment analysis revealed 49.2% (3711/7543) positive, 26.2% (1976/7543) negative, and 24.7% (1863/7543) neutral sentiment posts. Latent Dirichlet allocation topic modeling identified 8 key thematic clusters: medical procedures and outcomes (722/4059, 17.8% prevalence), daily life impact (666/4059, 16.4%), risks and complications (621/4059, 15.3%), patient education and support (544/4059, 13.4%), health care access and costs (499/4059, 12.3%), symptoms and side effects (442/4059, 10.9%), patient experiences and socioeconomic challenges (406/4059, 10%), and diet and fluid management (162/4059, 4%). Cross-analysis of topics and sentiment revealed that negative sentiment was highest for daily life impact (580/666, 87.1%) and socioeconomic challenges (145/406, 35.8%), whereas the education and support topic exhibited more positive sentiment (250/544, 46%). Topic coherence scores ranged from 0.38 to 0.52, with the medical procedures topic showing the highest semantic coherence. Intertopic distance mapping via multidimensional scaling revealed conceptual relationships between identified themes, with lifestyle impact and socioeconomic challenges clustering closely. Our longitudinal analysis demonstrated evolving discourse patterns, with technology-related discussions increasing by 24% in recent years, whereas financial concerns remained consistently prominent.
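
As a worked illustration of how the prevalence figures above relate to the reported counts, the sketch below recomputes each topic's share against the stated denominator of 4059 posts. The counts are copied from the abstract; the helper `share` and the dictionary name `topic_counts` are hypothetical, and rounding to one decimal place is an assumption about how the percentages were reported.

```python
# Topic counts as reported in the abstract.
topic_counts = {
    "medical procedures and outcomes": 722,
    "daily life impact": 666,
    "risks and complications": 621,
    "patient education and support": 544,
    "health care access and costs": 499,
    "symptoms and side effects": 442,
    "patient experiences and socioeconomic challenges": 406,
    "diet and fluid management": 162,
}

# Denominator stated in the abstract for the topic-modeling subset.
TOPIC_MODEL_POSTS = 4059


def share(count: int, total: int) -> float:
    """Percentage of `total` represented by `count`, to one decimal place."""
    return round(100 * count / total, 1)


if __name__ == "__main__":
    for topic, count in topic_counts.items():
        print(f"{topic}: {count}/{TOPIC_MODEL_POSTS} = "
              f"{share(count, TOPIC_MODEL_POSTS)}%")
```

The same helper applies to the cross-analysis figures, for example `share(580, 666)` for negative posts within the daily life impact topic.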

CONCLUSIONS

This study provides a comprehensive, data-driven understanding of the complex lived experiences of patients undergoing dialysis shared on social media. The findings underscore the need for more holistic, patient-centered care models and policies that address the multidimensional challenges illuminated by patients' voices.
