Identifying tweets of personal health experience through word embedding and LSTM neural network.

Affiliations

Department of Computer Information Technology and Graphics, Purdue University Northwest, Hammond, IN, USA.

Department of Medicine, Vanderbilt University, Nashville, TN, USA.

Publication information

BMC Bioinformatics. 2018 Jun 13;19(Suppl 8):210. doi: 10.1186/s12859-018-2198-y.

PMID: 29897323

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5998756/

Abstract

BACKGROUND

As Twitter has become an active data source for health surveillance research, efficient and effective methods are needed to identify tweets related to personal health experience. Conventional classification algorithms rely on features engineered by human domain experts, a challenging task that demands considerable human effort. The resulting features may not be optimal for the classification problem, and because personal experience can be expressed and described in tweets in many different ways, conventional classifiers can struggle to correctly predict personal experience tweets (PETs). In this study, we developed a method that combines word embedding with a long short-term memory (LSTM) model and does not require engineering any specific features. Through word embedding, tweet texts were represented as dense vectors, which in turn were fed to the LSTM neural network as sequences.
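The data flow described above (tokens, then dense vectors, then an LSTM reading the sequence) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy vocabulary, the dimensions `EMBED_DIM` and `HIDDEN_DIM`, the helper names, and the random untrained weights are all hypothetical, and a real model would learn its embeddings and weights from labeled tweets.

```python
import math
import random

random.seed(0)

EMBED_DIM = 4    # hypothetical embedding size
HIDDEN_DIM = 3   # hypothetical LSTM state size

def rand_matrix(rows, cols):
    """Small random weights; a real model would learn these from labeled tweets."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Toy vocabulary (assumption); index 0 is reserved for unknown words.
vocab = {"<unk>": 0, "i": 1, "took": 2, "ibuprofen": 3, "for": 4, "my": 5, "headache": 6}
embeddings = rand_matrix(len(vocab), EMBED_DIM)

# One weight matrix and bias per gate: input (i), forget (f), output (o), candidate (g).
# Each matrix acts on the concatenation [x_t, h_{t-1}].
gates = {name: (rand_matrix(HIDDEN_DIM, EMBED_DIM + HIDDEN_DIM), [0.0] * HIDDEN_DIM)
         for name in "ifog"}
out_weights = [random.uniform(-0.5, 0.5) for _ in range(HIDDEN_DIM)]

def embed(tweet):
    """Map each whitespace token to its dense vector, giving a sequence of vectors."""
    return [embeddings[vocab.get(tok, 0)] for tok in tweet.lower().split()]

def lstm_step(x, h, c):
    """One LSTM time step: update the cell state c and the hidden state h."""
    z = x + h  # list concatenation of input vector and previous hidden state
    pre = {name: [a + b for a, b in zip(matvec(W, z), bias)]
           for name, (W, bias) in gates.items()}
    i = [sigmoid(v) for v in pre["i"]]
    f = [sigmoid(v) for v in pre["f"]]
    o = [sigmoid(v) for v in pre["o"]]
    g = [math.tanh(v) for v in pre["g"]]
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new

def predict_pet_probability(tweet):
    """Run the LSTM over the embedded tweet and squash the final state to (0, 1)."""
    h = [0.0] * HIDDEN_DIM
    c = [0.0] * HIDDEN_DIM
    for x in embed(tweet):
        h, c = lstm_step(x, h, c)
    return sigmoid(sum(w * hk for w, hk in zip(out_weights, h)))

prob = predict_pet_probability("I took ibuprofen for my headache")
print(f"P(personal experience tweet) = {prob:.3f}")
```

Because the weights are random, the printed probability carries no meaning; the point is only that no hand-engineered features appear anywhere between the raw tokens and the classifier output.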

RESULTS

Statistical analyses of 10-fold cross-validation results for our method and conventional methods show significant differences (p < 0.01) in accuracy, precision, recall, F1-score, and ROC/AUC, demonstrating that our approach outperforms the conventional methods in identifying PETs.
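As a rough sketch of this kind of evaluation, the snippet below builds a 10-fold split and compares two methods on their per-fold scores. The abstract does not name the statistical test used, so the exact paired sign-flip (permutation) test here is an assumption chosen for illustration. The function names are hypothetical, and the per-fold F1 values are made up; they are not results from the paper.

```python
import itertools
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k near-equal test folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def paired_sign_flip_p_value(scores_a, scores_b):
    """Exact two-sided sign-flip test on per-fold score differences.

    Under the null hypothesis the sign of each paired difference is arbitrary,
    so every one of the 2**k sign assignments is enumerated and the share whose
    absolute summed difference is at least as large as the observed one is the
    p-value.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    total = 0
    extreme = 0
    for signs in itertools.product((1, -1), repeat=len(diffs)):
        total += 1
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed - 1e-12:
            extreme += 1
    return extreme / total

# Each fold serves once as the test set; the remaining folds form the training set.
folds = k_fold_indices(n_samples=95, k=10)

# Made-up per-fold F1 scores for illustration only; NOT results from the paper.
lstm_f1 = [0.81, 0.79, 0.83, 0.80, 0.82, 0.78, 0.84, 0.80, 0.81, 0.79]
baseline_f1 = [0.74, 0.73, 0.77, 0.75, 0.74, 0.72, 0.78, 0.73, 0.76, 0.74]

p = paired_sign_flip_p_value(lstm_f1, baseline_f1)
print(f"fold sizes: {[len(f) for f in folds]}, p = {p:.4f}")
```

Pairing the scores by fold matters because both methods are evaluated on the same test partitions, so fold-to-fold difficulty cancels out in the differences.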

CONCLUSION

We presented an efficient and effective method for identifying health-related personal experience tweets by combining word embedding and an LSTM neural network. Because it removes the laborious and costly process of feature engineering, our method could help accelerate and scale up the analysis of social media text for health surveillance.
