
Panning for gold: Comparative analysis of cross-platform approaches for automated detection of political content in textual data.

Affiliations

Institute of Communication and Media Studies, University of Bern, Bern, Switzerland.

Social Computing Group, University of Zurich, Zurich, Switzerland.

Publication information

PLoS One. 2024 Nov 18;19(11):e0312865. doi: 10.1371/journal.pone.0312865. eCollection 2024.

DOI:10.1371/journal.pone.0312865
PMID:39556542
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11573140/
Abstract

To understand and measure political information consumption in the high-choice media environment, we need new methods to trace individual interactions with online content and novel techniques to analyse and detect politics-related information. In this paper, we report the results of a comparative analysis of the performance of automated content analysis techniques for detecting political content in the German language across different platforms. Using three validation datasets, we compare the performance of three groups of detection techniques relying on dictionaries, classic supervised machine learning, and deep learning. We also examine the impact of different modes of data preprocessing on the low-cost implementations of these techniques using a large set (n = 66) of models. Our results show the limited impact of preprocessing on model performance, with the best results for less noisy data being achieved by deep learning- and classic machine learning-based models, in contrast to the more robust performance of dictionary-based models on noisy data.
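As a rough illustration of the simplest of the three technique groups named in the abstract, the sketch below implements a dictionary-based detector with one switchable preprocessing mode (lowercasing) and scores it with F1. Everything in it is an assumption made for illustration: the term list `POLITICAL_TERMS`, the `threshold` parameter, the helper names, and the four toy sentences are hypothetical and are not taken from the paper, and the toy scores say nothing about the paper's findings.

```python
import re

# Hypothetical mini-dictionary of German political terms (illustrative only,
# not the dictionary used in the paper).
POLITICAL_TERMS = {"bundestag", "wahl", "regierung", "partei", "kanzler"}


def preprocess(text, lowercase=True):
    """Split text into runs of letters; lowercasing is an optional preprocessing mode."""
    if lowercase:
        text = text.lower()
    # [^\W\d_]+ matches runs of Unicode letters, so umlauts are kept.
    return re.findall(r"[^\W\d_]+", text)


def is_political(text, threshold=1, lowercase=True):
    """Label a text as political if it contains at least `threshold` dictionary terms."""
    tokens = preprocess(text, lowercase=lowercase)
    hits = sum(1 for tok in tokens if tok in POLITICAL_TERMS)
    return hits >= threshold


def f1_score(y_true, y_pred):
    """Binary F1 computed from true positives, false positives, and false negatives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labelled examples, invented for this sketch.
samples = [
    ("Die Regierung plant eine neue Wahl.", True),
    ("Der Bundestag debattiert heute.", True),
    ("Das Wetter wird morgen sonnig.", False),
    ("Mein Lieblingsrezept für Apfelkuchen.", False),
]
y_true = [label for _, label in samples]
pred_lower = [is_political(text, lowercase=True) for text, _ in samples]
pred_raw = [is_political(text, lowercase=False) for text, _ in samples]

f1_lower = f1_score(y_true, pred_lower)
f1_raw = f1_score(y_true, pred_raw)
print(f"F1 with lowercasing: {f1_lower:.2f}, without: {f1_raw:.2f}")
```

The same `preprocess` step could feed a supervised classifier instead of a dictionary lookup, which is one way to compare technique groups under identical preprocessing modes. In this toy setup the dictionary is stored in lowercase, so skipping lowercasing misses the capitalised German nouns; that is an artefact of the sketch and should not be read as evidence about how much preprocessing matters in practice.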


