Feature engineering for sentiment analysis in e-health forums.

Affiliation

UNED IR & NLP Group, Madrid, Spain.

Publication information

PLoS One. 2018 Nov 29;13(11):e0207996. doi: 10.1371/journal.pone.0207996. eCollection 2018.

DOI: 10.1371/journal.pone.0207996
PMID: 30496232
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6264154/
Abstract

INTRODUCTION

Exploiting information in health-related social media services is of great interest to patients, researchers and medical companies. The challenge, however, is to provide easy, quick and relevant access to the vast amount of information that is available. One step towards facilitating access to online health data is opinion mining. Although the classification of patient opinions into positive and negative has been tackled before, most works rely on machine learning methods and bags of words. Our first contribution is an extensive evaluation of different features, including lexical, syntactic, semantic, network-based, sentiment-based and word-embedding features, to represent patient-authored texts for polarity classification. The second contribution of this work is the study of polar facts (i.e. objective information with polar connotations). Traditionally, the presence of polar facts has been neglected and research in polarity classification has been limited to opinionated texts. We demonstrate the existence and importance of polar facts for the polarity classification of health information.

MATERIAL AND METHODS

We annotate a set of more than 3500 posts from online health forums on breast cancer, Crohn's disease and various allergies. Each sentence in a post is manually labeled as "experience", "fact" or "opinion", and as "positive", "negative" or "neutral". Using this data, we train different machine learning algorithms and compare traditional bag-of-words representations with word embeddings, in combination with lexical, syntactic, semantic, network-based and emotional properties of texts, to automatically classify patient-authored content as positive, negative or neutral. In addition, we experiment with a combination of textual and semantic representations by generating concept embeddings using the UMLS Metathesaurus.
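
As a rough illustration of the two kinds of representation compared here, the following sketch builds a bag-of-words count vector and an averaged word-embedding vector for one sentence. The five-word vocabulary, the 2-dimensional values in `TOY_EMBEDDINGS`, and the function names are hypothetical and invented for this example; the study relied on trained embeddings and further feature families that are not shown.

```python
from collections import Counter

# Hypothetical 2-d embeddings, invented for illustration only
# (not the embeddings used in the study).
TOY_EMBEDDINGS = {
    "pain": [-0.9, 0.1],
    "worse": [-0.8, -0.2],
    "relief": [0.9, 0.2],
    "better": [0.8, 0.0],
    "treatment": [0.1, 0.7],
}

PUNCTUATION = ".,!?;:\"'"


def tokenize(sentence):
    """Lowercase, split on whitespace and strip surrounding punctuation."""
    tokens = (tok.strip(PUNCTUATION) for tok in sentence.lower().split())
    return [tok for tok in tokens if tok]


def bag_of_words(sentence, vocabulary):
    """Return one count per vocabulary word, in vocabulary order."""
    counts = Counter(tokenize(sentence))
    return [counts[word] for word in vocabulary]


def mean_embedding(sentence, embeddings, dim=2):
    """Average the vectors of all known tokens; zeros if none are known."""
    vectors = [embeddings[t] for t in tokenize(sentence) if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


vocabulary = sorted(TOY_EMBEDDINGS)
sentence = "The treatment gave me relief, I feel better."
bow = bag_of_words(sentence, vocabulary)         # sparse, one slot per word
emb = mean_embedding(sentence, TOY_EMBEDDINGS)   # dense, fixed dimension
```

The bag-of-words vector grows with the vocabulary and treats every word as unrelated to every other, whereas the averaged embedding has a fixed size and places related words close together. Either vector could then be passed to a classifier.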

RESULTS

We report two main findings. First, it is possible to predict the polarity of patient-authored content with an accuracy of approximately 70 percent using word embeddings, which considerably outperforms more traditional representations such as bags of words. Second, when dealing with medical information, negative and positive facts (i.e. objective information) are nearly as frequent as negative and positive opinions and experiences (i.e. subjective information), and they are crucial for polarity classification.
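
The two quantities discussed in the results, classification accuracy and the relative frequency of polar facts versus polar opinions and experiences, can be computed from labeled sentences along the following lines. This is a minimal sketch: the six rows in `SENTENCES` are invented toy data, not figures from the study, and the function names are hypothetical.

```python
from collections import Counter

# Toy (sentence type, gold polarity, predicted polarity) triples.
# Invented for illustration; these are not data from the study.
SENTENCES = [
    ("fact", "negative", "negative"),
    ("fact", "positive", "neutral"),
    ("opinion", "positive", "positive"),
    ("experience", "negative", "negative"),
    ("fact", "neutral", "neutral"),
    ("opinion", "negative", "positive"),
]


def polar_counts(rows):
    """Count non-neutral sentences, split into objective vs subjective."""
    counts = Counter()
    for kind, gold, _ in rows:
        if gold != "neutral":
            group = "objective" if kind == "fact" else "subjective"
            counts[group] += 1
    return counts


def accuracy(rows):
    """Fraction of sentences whose predicted polarity matches the gold label."""
    correct = sum(1 for _, gold, pred in rows if gold == pred)
    return correct / len(rows)


counts = polar_counts(SENTENCES)
acc = accuracy(SENTENCES)
```

Comparing `counts["objective"]` with `counts["subjective"]` shows how many polar sentences would be missed by a classifier trained only on opinionated text.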

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47b3/6264154/cd4094836f16/pone.0207996.g001.jpg
