How to evaluate sentiment classifiers for Twitter time-ordered data?

Affiliations

Department of Knowledge Technologies, Jožef Stefan Institute, Ljubljana, Slovenia.

INESC TEC, Porto, Portugal.

Publication information

PLoS One. 2018 Mar 13;13(3):e0194317. doi: 10.1371/journal.pone.0194317. eCollection 2018.

DOI: 10.1371/journal.pone.0194317
PMID: 29534112
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5849349/

Abstract

Social media are becoming an increasingly important source of information about the public mood regarding issues such as elections, Brexit, stock market, etc. In this paper we focus on sentiment classification of Twitter data. Construction of sentiment classifiers is a standard text mining task, but here we address the question of how to properly evaluate them as there is no settled way to do so. Sentiment classes are ordered and unbalanced, and Twitter produces a stream of time-ordered data. The problem we address concerns the procedures used to obtain reliable estimates of performance measures, and whether the temporal ordering of the training and test data matters. We collected a large set of 1.5 million tweets in 13 European languages. We created 138 sentiment models and out-of-sample datasets, which are used as a gold standard for evaluations. The corresponding 138 in-sample datasets are used to empirically compare six different estimation procedures: three variants of cross-validation, and three variants of sequential validation (where test set always follows the training set). We find no significant difference between the best cross-validation and sequential validation. However, we observe that all cross-validation variants tend to overestimate the performance, while the sequential methods tend to underestimate it. Standard cross-validation with random selection of examples is significantly worse than the blocked cross-validation, and should not be used to evaluate classifiers in time-ordered data scenarios.
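
The difference between the three families of estimation procedures named in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the fold counts, and the equal-sized blocks are assumptions made here, and the sketch ignores class stratification. Indices 0..n-1 are assumed to be sorted by tweet timestamp.

```python
import random


def random_cv_splits(n, k, seed=0):
    """Standard k-fold CV: test examples are drawn at random, ignoring time."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    for i in range(k):
        test = sorted(idx[i::k])          # every k-th shuffled index
        held_out = set(test)
        train = [j for j in range(n) if j not in held_out]
        yield train, test


def blocked_cv_splits(n, k):
    """Blocked k-fold CV: each test fold is one contiguous block in time."""
    bounds = [i * n // k for i in range(k + 1)]
    for i in range(k):
        test = list(range(bounds[i], bounds[i + 1]))
        train = list(range(bounds[i])) + list(range(bounds[i + 1], n))
        yield train, test


def sequential_splits(n, k):
    """Sequential validation: the test block always follows the training data."""
    bounds = [i * n // (k + 1) for i in range(k + 2)]
    for i in range(1, k + 1):
        train = list(range(bounds[i]))    # everything before the test block
        test = list(range(bounds[i], bounds[i + 1]))
        yield train, test


if __name__ == "__main__":
    for name, splitter in [("random", random_cv_splits),
                           ("blocked", blocked_cv_splits),
                           ("sequential", sequential_splits)]:
        train, test = next(iter(splitter(20, 4)))
        print(name, "train:", train, "test:", test)
```

In the random variant, tweets posted close together in time can land on both sides of the split. The blocked variant keeps each test fold contiguous but still trains on data from after the test block. Only the sequential variant never trains on data that is newer than the test block.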

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9020/5849349/0bb354666e7c/pone.0194317.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9020/5849349/4c2bcbb047ca/pone.0194317.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9020/5849349/84d96c2fa91c/pone.0194317.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9020/5849349/662be3be49f5/pone.0194317.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9020/5849349/48aea962b2d3/pone.0194317.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9020/5849349/88c1e0490fdc/pone.0194317.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9020/5849349/b6dedf46cb1b/pone.0194317.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9020/5849349/2daf323e04fa/pone.0194317.g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9020/5849349/3fc723e5743f/pone.0194317.g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9020/5849349/119744323205/pone.0194317.g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9020/5849349/0143866b51b5/pone.0194317.g011.jpg
