Analysis and classification of privacy-sensitive content in social media posts.

Authors

Bioglio Livio, Pensa Ruggero G

Affiliation

University of Turin, C.So Svizzera, 185, I-10149 Turin, Italy.

Publication

EPJ Data Sci. 2022;11(1):12. doi: 10.1140/epjds/s13688-022-00324-y. Epub 2022 Mar 3.

DOI:10.1140/epjds/s13688-022-00324-y
PMID:35261872
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8892403/

Abstract

User-generated contents often contain private information, even when they are shared publicly on social media and on the web in general. Although many filtering and natural language approaches for automatically detecting obscenities or hate speech have been proposed, determining whether a shared post contains sensitive information is still an open issue. The problem has been addressed by assuming, for instance, that sensitive contents are published anonymously, on anonymous social media platforms or with more restrictive privacy settings, but these assumptions are far from being realistic, since the authors of posts often underestimate or overlook their actual exposure to privacy risks. Hence, in this paper, we address the problem of content sensitivity analysis directly, by presenting and characterizing a new annotated corpus with around ten thousand posts, each one annotated as sensitive or non-sensitive by a pool of experts. We characterize our data with respect to the closely-related problem of self-disclosure, pointing out the main differences between the two tasks. We also present the results of several deep neural network models that outperform previous naive attempts of classifying social media posts according to their sensitivity, and show that state-of-the-art approaches based on anonymity and lexical analysis do not work in realistic application scenarios.
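
To make the abstract's point about lexical analysis concrete, here is a minimal sketch of a keyword-lexicon sensitivity check. It is not the authors' method or any baseline from the paper: the lexicon `SENSITIVE_TERMS`, the helper names, and the example posts are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical lexicon (an assumption for illustration, not taken from the paper).
SENSITIVE_TERMS = {"diagnosed", "salary", "pregnant", "divorce", "rehab"}


def tokenize(post: str) -> list:
    """Lowercase the post and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", post.lower())


def lexical_is_sensitive(post: str, lexicon=SENSITIVE_TERMS) -> bool:
    """Flag a post as sensitive if any token appears in the lexicon."""
    return any(token in lexicon for token in tokenize(post))


# Invented example posts.
posts = [
    "I was diagnosed with diabetes last week",          # personal disclosure, lexicon hit
    "We will be away from home until Sunday",           # reveals an absence, no lexicon hit
    "The actor was diagnosed by critics as overrated",  # nothing personal, lexicon hit
]

predictions = [lexical_is_sensitive(post) for post in posts]
print(predictions)  # [True, False, True]
```

In this toy run the second post is missed and the third is flagged even though it discloses nothing personal, which is the kind of mismatch between word lists and actual sensitivity that motivates learning from an annotated corpus instead.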


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eee1/8892403/e9c3c5112ea4/13688_2022_324_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eee1/8892403/99253564496b/13688_2022_324_Fig2_HTML.jpg

Similar articles

1. Analysis and classification of privacy-sensitive content in social media posts.
   EPJ Data Sci. 2022;11(1):12. doi: 10.1140/epjds/s13688-022-00324-y. Epub 2022 Mar 3.
2. Social Media Content About Children's Pain and Sleep: Content and Network Analysis.
   JMIR Pediatr Parent. 2018 Dec 11;1(2):e11193. doi: 10.2196/11193.
3. Combating hate speech using an adaptive ensemble learning model with a case study on COVID-19.
   Expert Syst Appl. 2021 Dec 15;185:115632. doi: 10.1016/j.eswa.2021.115632. Epub 2021 Jul 27.
4. Detection of Hate Speech in COVID-19-Related Tweets in the Arab Region: Deep Learning and Topic Modeling Approach.
   J Med Internet Res. 2020 Dec 8;22(12):e22609. doi: 10.2196/22609.
5. Classification of Health-Related Social Media Posts: Evaluation of Post Content-Classifier Models and Analysis of User Demographics.
   JMIR Public Health Surveill. 2020 Apr 1;6(2):e14952. doi: 10.2196/14952.
6. Macromolecular crowding: chemistry and physics meet biology (Ascona, Switzerland, 10-14 June 2012).
   Phys Biol. 2013 Aug;10(4):040301. doi: 10.1088/1478-3975/10/4/040301. Epub 2013 Aug 2.
7. Health issue identification in social media based on multi-task hierarchical neural networks with topic attention.
   Artif Intell Med. 2021 Aug;118:102119. doi: 10.1016/j.artmed.2021.102119. Epub 2021 May 31.
8. PREDOSE: a semantic web platform for drug abuse epidemiology using social media.
   J Biomed Inform. 2013 Dec;46(6):985-97. doi: 10.1016/j.jbi.2013.07.007. Epub 2013 Jul 25.
9. Characterizing and Identifying the Prevalence of Web-Based Misinformation Relating to Medication for Opioid Use Disorder: Machine Learning Approach.
   J Med Internet Res. 2021 Dec 22;23(12):e30753. doi: 10.2196/30753.
10. [Utilizing social media data in post-market safety surveillance].
    Beijing Da Xue Xue Bao Yi Xue Ban. 2021 Jun 18;53(3):623-627. doi: 10.19723/j.issn.1671-167X.2021.03.031.

Cited by

1. Predicting social media users' indirect aggression through pre-trained models.
   PeerJ Comput Sci. 2024 Sep 2;10:e2292. doi: 10.7717/peerj-cs.2292. eCollection 2024.
