Fractality and Variability in Canonical and Non-Canonical English Fiction and in Non-Fictional Texts.

Author information

Mohseni Mahdi, Gast Volker, Redies Christoph

Affiliations

Experimental Aesthetics Group, Institute of Anatomy I, Jena University Hospital, University of Jena, Jena, Germany.

Department of English and American Studies, University of Jena, Jena, Germany.

Publication information

Front Psychol. 2021 Mar 31;12:599063. doi: 10.3389/fpsyg.2021.599063. eCollection 2021.

Abstract

This study investigates global properties of three categories of English text: canonical fiction, non-canonical fiction, and non-fictional texts. The central hypothesis is that there are systematic differences in structural design features between canonical and non-canonical fiction, and between fictional and non-fictional texts. To investigate these differences, we compiled a corpus containing texts of the three categories of interest, the Jena Corpus of Expository and Fictional Prose (JEFP Corpus). Two aspects of global structure are investigated: variability and self-similar (fractal) patterns, which reflect long-range correlations along texts. We use four types of basic observations: (i) the frequency of POS-tags per sentence, (ii) sentence length, (iii) lexical diversity, and (iv) the distribution of topic probabilities in segments of texts. These basic observations are grouped into two more general categories: (a) the lower-level properties (i) and (ii), which are observed at the level of the sentence (reflecting linguistic decoding), and (b) the higher-level properties (iii) and (iv), which are observed at the textual level (reflecting comprehension/integration). The observations for each property are transformed into series, which are analyzed in terms of variance and subjected to Multi-Fractal Detrended Fluctuation Analysis (MFDFA), giving rise to three statistics: (i) the degree of fractality, (ii) the degree of multifractality, i.e., the width of the fractal spectrum, and (iii) the degree of asymmetry of the fractal spectrum. The statistics thus obtained are compared individually across text categories and jointly fed into a classification model (Support Vector Machine). Our results show that there are in fact differences between the three text categories of interest. In general, lower-level text properties are better discriminators than higher-level text properties. Canonical fictional texts differ from non-canonical ones primarily in terms of variability in lower-level text properties. Fractality seems to be a universal feature of text, slightly more pronounced in non-fictional than in fictional texts. Based on these corpus-derived results, we point out some avenues for future research leading toward a more comprehensive analysis of textual aesthetics, e.g., using experimental methodologies.
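
As a rough illustration of the MFDFA step mentioned in the abstract, the following is a minimal sketch of the standard textbook procedure, not the authors' implementation. The function names `mfdfa` and `spectrum_width`, the choice of scales and q values, and the linear detrending order are all assumptions made for this example. The input could be any per-sentence series, such as sentence lengths.

```python
import numpy as np

def mfdfa(series, scales, qs, order=1):
    """Estimate generalized Hurst exponents h(q) for a 1-D series.

    Hypothetical helper for illustration: scales are window sizes,
    qs are the moment orders, order is the detrending polynomial degree.
    """
    x = np.asarray(series, dtype=float)
    profile = np.cumsum(x - x.mean())            # integrated, mean-centred series
    log_f = np.empty((len(qs), len(scales)))
    for j, s in enumerate(scales):
        s = int(s)
        n_seg = len(profile) // s
        segments = profile[: n_seg * s].reshape(n_seg, s).T   # shape (s, n_seg)
        t = np.arange(s)
        coeffs = np.polyfit(t, segments, order)  # one polynomial fit per segment
        trend = np.vander(t, order + 1) @ coeffs
        var = np.mean((segments - trend) ** 2, axis=0)  # residual variance per segment
        for i, q in enumerate(qs):
            if q == 0:
                # Limit q -> 0 of the fluctuation function (logarithmic average)
                log_f[i, j] = 0.5 * np.mean(np.log(var))
            else:
                log_f[i, j] = np.log(np.mean(var ** (q / 2.0))) / q
    log_s = np.log(np.asarray(scales, dtype=float))
    # h(q) is the slope of log F_q(s) against log s
    return np.array([np.polyfit(log_s, row, 1)[0] for row in log_f])

def spectrum_width(qs, hq):
    """Width of the singularity spectrum via tau(q) = q*h(q) - 1, alpha = dtau/dq."""
    qs = np.asarray(qs, dtype=float)
    tau = qs * hq - 1.0
    alpha = np.gradient(tau, qs)
    return alpha.max() - alpha.min()
```

In this sketch, h(2) plays the role of a single "degree of fractality" (values near 0.5 indicate no long-range correlation), and the width of the spectrum is one common way to quantify multifractality. How exactly the paper derives its three statistics from the MFDFA output is not stated in the abstract.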


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a4e3/8044424/370f6f8c1808/fpsyg-12-599063-g0001.jpg
