Exploring Eating Disorder Topics on Twitter: Machine Learning Approach.

Author information

Zhou Sicheng, Zhao Yunpeng, Bian Jiang, Haynos Ann F, Zhang Rui

Affiliations

Institute for Health Informatics, University of Minnesota, Minneapolis, MN, United States.

Department of Health Outcomes & Biomedical Informatics, University of Florida, Gainesville, FL, United States.

Publication information

JMIR Med Inform. 2020 Oct 30;8(10):e18273. doi: 10.2196/18273.

DOI:10.2196/18273

PMID:33124997

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7665945/

Abstract

BACKGROUND

Eating disorders (EDs) are a group of mental illnesses that have an adverse effect on both mental and physical health. As social media platforms (eg, Twitter) have become an important data source for public health research, some studies have qualitatively explored the ways in which EDs are discussed on these platforms. Initial results suggest that such research offers a promising method for further understanding this group of diseases. Nevertheless, an efficient computational method is needed to further identify and analyze tweets relevant to EDs on a larger scale.

OBJECTIVE

This study aims to develop and validate a machine learning-based classifier to identify tweets related to EDs and to explore factors (ie, topics) related to EDs using a topic modeling method.

METHODS

We collected potentially ED-relevant tweets using keywords from previous studies and annotated these tweets into groups (ie, ED relevant vs irrelevant and then promotional information vs laypeople discussion). Several supervised machine learning methods, including convolutional neural network (CNN), long short-term memory (LSTM), support vector machine, and naïve Bayes, were developed and evaluated using the annotated data. We used the best-performing classifier to identify ED-relevant tweets and applied a topic modeling method, Correlation Explanation (CorEx), to analyze the content of the identified tweets. To validate these machine learning results, we also collected a cohort of ED-relevant tweets on the basis of manually curated rules.
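
As an illustration of the simplest of the supervised baselines named above, the following is a minimal sketch of a multinomial naïve Bayes text classifier with add-one smoothing, written in plain Python. It is not the authors' implementation: the tokenizer, the class name `NaiveBayes`, and the toy training tweets and labels are all hypothetical and were invented for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (real tweet preprocessing would do more)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # ignore tokens never seen in training
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data, invented for illustration only.
train_texts = [
    "struggling with my eating disorder recovery today",
    "anorexia recovery is hard but worth it",
    "binge eating again and feeling guilty",
    "my cat is eating everything in the house",
    "great pizza place downtown eating well tonight",
    "watching the game with friends tonight",
]
train_labels = ["relevant"] * 3 + ["irrelevant"] * 3

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("recovery from anorexia is hard"))
print(model.predict("pizza with friends tonight"))
```

A keyword such as "eating" appears in both classes here, which is why keyword collection alone is not enough and a trained relevance classifier is needed.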

RESULTS

A total of 123,977 tweets were collected during the set period. We annotated a random sample of 2219 tweets for developing the machine learning classifiers. We developed a CNN-LSTM classifier to identify ED-relevant tweets published by laypeople in 2 steps: first relevant versus irrelevant (F score=0.89) and then promotional versus published by laypeople (F score=0.90). A total of 40,790 ED-relevant tweets were identified using the CNN-LSTM classifier. We also identified another set of tweets posted by laypeople (ie, 17,632 ED-relevant and 83,557 ED-irrelevant tweets) using manually specified rules. Using CorEx on all ED-relevant tweets, the topic model identified 162 topics. Overall, the coherence rate for topic modeling was 77.07% (1264/1640), indicating a high quality of the produced topics. The topics were further reviewed and analyzed by a domain expert.
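
The 2-step identification and the reported figures can be sketched as follows. This is a rough illustration under stated assumptions, not the study's code: `toy_is_relevant` and `toy_is_layperson` are hypothetical keyword stand-ins for the two trained CNN-LSTM steps, the sample tweets are invented, and the F score is assumed here to be the usual F1 (harmonic mean of precision and recall), which the abstract does not state explicitly.

```python
def f_score(precision, recall):
    """F1: harmonic mean of precision and recall (assumed definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def two_step_filter(tweets, is_relevant, is_layperson):
    """Step 1 keeps relevant tweets; step 2 keeps those not flagged promotional."""
    relevant = [t for t in tweets if is_relevant(t)]
    return [t for t in relevant if is_layperson(t)]


# Hypothetical stand-ins for the two trained classifiers.
def toy_is_relevant(tweet):
    return "recovery" in tweet.lower()


def toy_is_layperson(tweet):
    return "http" not in tweet.lower()


# Invented sample tweets, for illustration only.
sample = [
    "Day 30 of recovery and feeling stronger",
    "Buy our recovery meal plan now http://example.com",
    "Watching the game tonight",
]
kept = two_step_filter(sample, toy_is_relevant, toy_is_layperson)
print(kept)

# Coherence rate as reported in the abstract: 1264 out of 1640.
coherence_rate = 1264 / 1640
print(f"{coherence_rate:.2%}")
```

Running the steps as a cascade means the second classifier only ever sees tweets the first one accepted, so errors in step 1 carry through to the final set.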

CONCLUSIONS

The developed CNN-LSTM classifier could improve the efficiency of identifying ED-relevant tweets compared with the traditional manual method. The CorEx topic model was applied separately to the tweets identified by the machine learning-based classifier and to those identified by the traditional manual approach. Highly overlapping topics were observed between the 2 cohorts of tweets. The produced topics were further reviewed by a domain expert. Some of the topics identified from the potentially ED-relevant tweets may provide new avenues for understanding this serious set of disorders.
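
The abstract does not say how topic overlap between the 2 cohorts was measured. One possible way to quantify it, shown purely as an assumption, is to treat each topic as a set of top words and match topics across cohorts by Jaccard similarity. The topic word lists and the 0.5 threshold below are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two word collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_matches(topics_a, topics_b):
    """For each topic in cohort A, the highest similarity to any topic in cohort B."""
    return [
        max((jaccard(ta, tb) for tb in topics_b), default=0.0)
        for ta in topics_a
    ]


# Hypothetical topics (top words), invented for illustration only.
ml_topics = [
    ["recovery", "therapy", "support"],
    ["body", "image", "mirror"],
    ["exercise", "gym", "run"],
]
rule_topics = [
    ["recovery", "support", "help"],
    ["body", "image", "mirror"],
    ["school", "exam", "stress"],
]

scores = best_matches(ml_topics, rule_topics)
# Share of classifier-cohort topics with a close counterpart (threshold is arbitrary).
overlap_share = sum(s >= 0.5 for s in scores) / len(scores)
print(scores, overlap_share)
```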

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7cd3/7665945/d583b857b243/medinform_v8i10e18273_fig1.jpg
