A Complex Network Approach to Stylometry.

Author information

Amancio Diego Raphael

Affiliation

Institute of Mathematical and Computer Sciences, University of São Paulo, São Carlos, São Paulo, Brazil.

Publication information

PLoS One. 2015 Aug 27;10(8):e0136076. doi: 10.1371/journal.pone.0136076. eCollection 2015.

DOI:10.1371/journal.pone.0136076

PMID:26313921

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC4552030/

Abstract

Statistical methods have been widely employed to study the fundamental properties of language. In recent years, methods from complex and dynamical systems proved useful to create several language models. Despite the large amount of studies devoted to represent texts with physical models, only a limited number of studies have shown how the properties of the underlying physical systems can be employed to improve the performance of natural language processing tasks. In this paper, I address this problem by devising complex networks methods that are able to improve the performance of current statistical methods. Using a fuzzy classification strategy, I show that the topological properties extracted from texts complement the traditional textual description. In several cases, the performance obtained with hybrid approaches outperformed the results obtained when only traditional or networked methods were used. Because the proposed model is generic, the framework devised here could be straightforwardly used to study similar textual applications where the topology plays a pivotal role in the description of the interacting agents.
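The abstract does not say which network model, which topological measurements, or which combination rule the paper uses. As a rough illustration only, the sketch below assumes a word adjacency network (consecutive words are linked), the local clustering coefficient as the topological feature, and a convex combination of per-class membership scores as the hybrid step. All function names and the `weight` parameter are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def adjacency_network(text):
    """Assumed model: link each pair of consecutive words (undirected, unweighted)."""
    words = text.lower().split()
    graph = defaultdict(set)
    for a, b in zip(words, words[1:]):
        if a != b:  # ignore self-loops from repeated words
            graph[a].add(b)
            graph[b].add(a)
    return graph


def clustering(graph, node):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    nbrs = list(graph[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in graph[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


def hybrid_membership(traditional, networked, weight=0.5):
    """Assumed hybrid rule: convex combination of two per-class membership scores."""
    classes = set(traditional) | set(networked)
    return {
        c: weight * traditional.get(c, 0.0) + (1 - weight) * networked.get(c, 0.0)
        for c in classes
    }


if __name__ == "__main__":
    g = adjacency_network("the cat saw the dog and the cat saw the bird")
    print(len(g["the"]), clustering(g, "the"))
    # Hypothetical membership scores from a traditional and a network-based classifier.
    print(hybrid_membership({"A": 0.8, "B": 0.2}, {"A": 0.4, "B": 0.6}))
```

In this sketch, `weight` sets how much the traditional (for example, word-frequency based) scores count relative to the network-based ones. A value of 1.0 or 0.0 reduces it to a purely traditional or purely networked classifier.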

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4c98/4552030/c7fd1da7de2c/pone.0136076.g001.jpg
