
Strength in Numbers: Using Big Data to Simplify Sentiment Classification.

Affiliations

1 IOMS Department, NYU Stern School of Business, New York, New York.

2 School of Business, Stevens Institute of Technology, Hoboken, New Jersey.

Publication information

Big Data. 2017 Sep;5(3):256-271.

PMID: 28933941

Abstract

Sentiment classification, the task of assigning a positive or negative label to a text segment, is a key component of mainstream applications such as reputation monitoring, sentiment summarization, and item recommendation. Even though the performance of sentiment classification methods has steadily improved over time, their ever-increasing complexity renders them comprehensible by only a shrinking minority of expert practitioners. For all others, such highly complex methods are black-box predictors that are hard to tune and even harder to justify to decision makers. Motivated by these shortcomings, we introduce BigCounter: a new algorithm for sentiment classification that substitutes algorithmic complexity with Big Data. Our algorithm combines standard data structures with statistical testing to deliver accurate and interpretable predictions. It is also parameter free and suitable for use virtually "out of the box," which makes it appealing for organizations wanting to leverage their troves of unstructured data without incurring the significant expense of creating in-house teams of data scientists. Finally, BigCounter's efficient and parallelizable design makes it applicable to very large data sets. We apply our method on such data sets toward a study on the limits of Big Data for sentiment classification. Our study finds that, after a certain point, predictive performance tends to converge and additional data have little benefit. Our algorithmic design and findings provide the foundations for future research on the data-over-computation paradigm for classification problems.
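The abstract does not spell out how BigCounter works beyond "standard data structures with statistical testing", so the following is only a minimal sketch of that general idea and should not be read as the authors' algorithm. It assumes, hypothetically, that the data structure is a pair of per-word document counters and that the test is an exact two-sided binomial test against a 50/50 split (which presumes balanced classes). The names `train`, `binom_two_sided_p`, and `classify`, and the `alpha` threshold, are illustrative choices introduced here; the abstract describes the real method as parameter free.

```python
from collections import Counter
from math import comb


def train(labeled_texts):
    """Count, per word, how many positive and negative texts contain it."""
    pos, neg = Counter(), Counter()
    for text, label in labeled_texts:
        target = pos if label == "pos" else neg
        target.update(set(text.lower().split()))  # one count per text
    return pos, neg


def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value (fine for small counts only).

    Sums the probability of every outcome that is no more likely than k.
    """
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = probs[k] * (1 + 1e-9)  # tolerance for float comparison
    return min(1.0, sum(q for q in probs if q <= threshold))


def classify(text, pos, neg, alpha=0.05):
    """Let each significantly skewed word cast one vote.

    alpha is an assumption of this sketch, not something from the paper.
    Returns the label plus the (word, pos_count, neg_count) evidence.
    """
    votes, evidence = 0, []
    for word in sorted(set(text.lower().split())):
        k = pos[word]
        n = k + neg[word]
        if n == 0:
            continue  # unseen word carries no evidence
        if binom_two_sided_p(k, n) < alpha:
            votes += 1 if k > n - k else -1
            evidence.append((word, k, n - k))
    label = "pos" if votes > 0 else "neg" if votes < 0 else "undecided"
    return label, evidence


# Toy corpus: six positive and six negative one-line reviews.
data = [("the food was great", "pos")] * 6 + [("the food was awful", "neg")] * 6
pos_counts, neg_counts = train(data)
print(classify("great service", pos_counts, neg_counts))
print(classify("the food", pos_counts, neg_counts))
```

Because every prediction comes with the list of words that passed the test and their raw counts, a sketch like this shows one way counting plus testing can stay interpretable. Words shared evenly by both classes, such as "the" in the toy corpus, fail the test and cast no vote.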


Similar articles

1
Strength in Numbers: Using Big Data to Simplify Sentiment Classification.
Big Data. 2017 Sep;5(3):256-271.
2
Automatic Construction and Global Optimization of a Multisentiment Lexicon.
Comput Intell Neurosci. 2016;2016:2093406. doi: 10.1155/2016/2093406. Epub 2016 Nov 29.
3
A global optimization approach to multi-polarity sentiment analysis.
PLoS One. 2015 Apr 24;10(4):e0124672. doi: 10.1371/journal.pone.0124672. eCollection 2015.
4
Big Data Recommendation Research Based on Travel Consumer Sentiment Analysis.
Front Psychol. 2022 Feb 28;13:857292. doi: 10.3389/fpsyg.2022.857292. eCollection 2022.
5
Using Linked Data for polarity classification of patients' experiences.
J Biomed Inform. 2015 Oct;57:6-19. doi: 10.1016/j.jbi.2015.06.017. Epub 2015 Jul 23.
6
Malay sentiment analysis based on combined classification approaches and Senti-lexicon algorithm.
PLoS One. 2018 Apr 23;13(4):e0194852. doi: 10.1371/journal.pone.0194852. eCollection 2018.
7
The Scalable Fuzzy Inference-Based Ensemble Method for Sentiment Analysis.
Comput Intell Neurosci. 2022 Sep 28;2022:5186144. doi: 10.1155/2022/5186144. eCollection 2022.
8
Rule-Based Arabic Sentiment Analysis using Binary Equilibrium Optimization Algorithm.
Arab J Sci Eng. 2023;48(2):2359-2374. doi: 10.1007/s13369-022-07198-2. Epub 2022 Sep 26.
9
Sentiment Classification of News Text Data Using Intelligent Model.
Front Psychol. 2021 Sep 28;12:758967. doi: 10.3389/fpsyg.2021.758967. eCollection 2021.
10
Emotional analysis of evaluation discourse in business English translation based on language big data mining of public health environment.
Front Public Health. 2022 Oct 20;10:981182. doi: 10.3389/fpubh.2022.981182. eCollection 2022.