Ramamoorthy Thilagavathi, Kulothungan Vaitheeswaran, Mappillairaju Bagavandas
School of Public Health, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu, India.
ICMR-National Centre for Disease Informatics and Research, Bengaluru, India.
Front Artif Intell. 2024 Feb 12;7:1329185. doi: 10.3389/frai.2024.1329185. eCollection 2024.
Social media offers a promising avenue for the prevention and management of diabetes. To meet the community's needs for diabetes-related knowledge, support, and intervention, it is essential to understand the extent and content of discussions about this health issue. This study assesses and compares several topic modeling techniques to determine the most effective model for identifying the core themes in diabetes-related tweets, the sources disseminating this information, the reach of these themes, and the influential individuals within the Twitter community in India.
Twitter messages from India, dated between 7 November 2022 and 28 February 2023, were collected using the Twitter API. The unsupervised machine learning topic models, namely, Latent Dirichlet Allocation (LDA), non-negative matrix factorization (NMF), BERTopic, and Top2Vec, were compared, and the best-performing model was used to identify common diabetes-related topics. Influential users were identified through social network analysis.
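To make one of the compared techniques concrete, the following is a minimal sketch of NMF topic extraction, not the authors' pipeline. It assumes a plain bag-of-words count matrix and the standard multiplicative update rules for the squared-error objective. The four short "tweets", the choice of two topics, and all function names are hypothetical and exist only for illustration. The abstract does not describe the preprocessing, weighting, or library the study used.

```python
import random

# Hypothetical toy corpus; these are NOT tweets from the study.
docs = [
    "insulin dose sugar control",
    "sugar control insulin medicine",
    "diet millet fibre diet",
    "fibre diet millet recipe",
]

def tokenize(text):
    """Lowercase, split on whitespace, keep purely alphabetic tokens."""
    return [w for w in text.lower().split() if w.isalpha()]

def build_matrix(documents):
    """Return a document-by-term count matrix V and the sorted vocabulary."""
    vocab = sorted({w for d in documents for w in tokenize(d)})
    index = {w: j for j, w in enumerate(vocab)}
    V = [[0.0] * len(vocab) for _ in documents]
    for i, d in enumerate(documents):
        for w in tokenize(d):
            V[i][index[w]] += 1.0
    return V, vocab

def transpose(A):
    return [list(r) for r in zip(*A)]

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def frob_error(V, W, H):
    """Squared Frobenius norm of V - WH."""
    WH = matmul(W, H)
    return sum((v - p) ** 2 for rv, rp in zip(V, WH) for v, p in zip(rv, rp))

def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Factor V (n x m) into non-negative W (n x k) and H (k x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return W, H

def top_terms(H, vocab, n=3):
    """For each topic (row of H), list the n highest-weighted terms."""
    return [[vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:n]]
            for row in H]

V, vocab = build_matrix(docs)
W, H = nmf(V, k=2)
for t, terms in enumerate(top_terms(H, vocab)):
    print(f"topic {t}: {terms}")
```

Each row of `H` is a topic expressed as term weights, and each row of `W` gives a document's loading on those topics. In practice a library implementation with TF-IDF weighting would normally replace this hand-rolled version.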
The NMF model outperformed the LDA model, whereas BERTopic performed better than Top2Vec. Diabetes-related conversations revolved around eight topics: promotion; management; drug and personal story; consequences; risk factors and research; raising awareness and providing support; diet; and opinion and lifestyle changes. The influential nodes identified were mainly health professionals and healthcare organizations.
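As an illustration of how influential nodes can be ranked in a social network analysis, the sketch below computes normalized in-degree centrality on a directed interaction graph. The abstract does not state which centrality measure or graph construction the study used, so both are assumptions here. The account handles and edges are hypothetical, with an edge `(source, target)` read as "source retweeted or mentioned target".

```python
from collections import defaultdict

def in_degree_centrality(edges):
    """Normalized in-degree: distinct accounts pointing at a node / (N - 1)."""
    nodes = set()
    sources = defaultdict(set)
    for src, dst in edges:
        nodes.update((src, dst))
        if src != dst:  # ignore self-loops
            sources[dst].add(src)
    denom = max(len(nodes) - 1, 1)
    return {n: len(sources[n]) / denom for n in nodes}

# Hypothetical interaction edges; these are NOT accounts from the study.
edges = [
    ("u1", "clinic"),
    ("u2", "clinic"),
    ("u3", "clinic"),
    ("u1", "dr_x"),
    ("clinic", "dr_x"),
    ("u2", "u3"),
]

scores = in_degree_centrality(edges)
ranking = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
for handle, score in ranking:
    print(f"{handle}: {score:.2f}")
```

In-degree rewards accounts that many distinct users amplify. Measures such as betweenness or eigenvector centrality capture other notions of influence and could be substituted.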
The study identified important topics of discussion along with health professionals and healthcare organizations involved in sharing diabetes-related information with the public. Collaborations among influential healthcare organizations, health professionals, and the government can foster awareness and prevent noncommunicable diseases.