Comparing compression models for authorship attribution.

Affiliation

Pontifical Catholic University of Parana (PUCPR), R. Imaculada Conceição, 1155 Curitiba, PR, Brazil.

Publication information

Forensic Sci Int. 2013 May 10;228(1-3):100-4. doi: 10.1016/j.forsciint.2013.02.025. Epub 2013 Mar 24.

Abstract

In this paper we compare different compression models for authorship attribution. To this end, our experiments considered three types of compressors, Lempel-Ziv (GZip), block-sorting (BZip), and statistical (PPM), along with two similarity measures. We also analyze two attribution methods. Through a series of experiments on two databases, we show that all the compressors behave similarly, but that the similarity measures can vary considerably depending on the strategy used for authorship attribution. Our results corroborate the literature: compression models are a good alternative for authorship attribution, surpassing traditional pattern recognition systems based on classifiers and feature extraction.
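To make the idea of compression-based attribution concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not name its two similarity measures or its two attribution methods, so the sketch assumes the Normalized Compression Distance (NCD), a measure commonly used in this line of work, and a simple "closest author profile" strategy in which each author's known texts are concatenated into one string. The function names (`compressed_size`, `ncd`, `attribute`) are hypothetical. Only the Lempel-Ziv and block-sorting families are covered, through Python's standard `zlib` (DEFLATE, the algorithm behind GZip) and `bz2` modules; PPM is omitted because the standard library has no PPM compressor.

```python
import bz2
import zlib
from typing import Mapping


def compressed_size(data: bytes, method: str = "gzip") -> int:
    """Return the length in bytes of `data` after compression."""
    if method == "gzip":
        # zlib implements DEFLATE, the Lempel-Ziv-based algorithm used by GZip.
        return len(zlib.compress(data, 9))
    if method == "bzip":
        # bz2 implements the block-sorting (Burrows-Wheeler) BZip2 algorithm.
        return len(bz2.compress(data, 9))
    raise ValueError(f"unknown compression method: {method!r}")


def ncd(x: bytes, y: bytes, method: str = "gzip") -> float:
    """Normalized Compression Distance (an assumed similarity measure).

    Values near 0 mean the two inputs share much structure;
    values near 1 mean they share little.
    """
    cx = compressed_size(x, method)
    cy = compressed_size(y, method)
    cxy = compressed_size(x + y, method)
    return (cxy - min(cx, cy)) / max(cx, cy)


def attribute(unknown: str, profiles: Mapping[str, str], method: str = "gzip") -> str:
    """Return the author whose profile text is closest to `unknown` under NCD.

    `profiles` maps an author name to the concatenation of that
    author's known texts (one assumed attribution strategy).
    """
    u = unknown.encode("utf-8")
    return min(
        profiles,
        key=lambda author: ncd(profiles[author].encode("utf-8"), u, method),
    )
```

A call such as `attribute(disputed_text, {"author_a": known_a, "author_b": known_b}, method="bzip")` would return the name of the closest profile. With very short texts the fixed overhead of each compressor dominates the distance, so a sketch like this is only meaningful on documents of at least a few kilobytes.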
