Frommholz Ingo, Al-Khateeb Haider M, Potthast Martin, Ghasem Zinnar, Shukla Mitul, Short Emma
The National Centre for Cyberstalking Research, Institute for Research in Applicable Computing, University of Bedfordshire, Luton, UK.
Web Technology and Information Systems, Bauhaus-Universität Weimar, Weimar, Germany.
Datenbank Spektrum. 2016;16(2):127-135. doi: 10.1007/s13222-016-0221-x. Epub 2016 Jun 1.
Cyber security has become a major concern for users and businesses alike. Cyberstalking and harassment have been identified as a growing anti-social problem. Besides detecting cyberstalking and harassment, there is a need to gather digital evidence, a task that often falls to the victim. To this end, we survey and discuss relevant technological means, in particular from text analytics and machine learning, that are capable of addressing these challenges. We present a framework for the detection of text-based cyberstalking and discuss the role and challenges of core techniques such as author identification, text classification and personalisation. We then discuss PAN, a network and evaluation initiative that focusses on digital text forensics, in particular author identification.