Pérez Pozo Álvaro, de la Rosa Javier, Ros Salvador, González-Blanco Elena, Hernández Laura, de Sisto Mirella
Universidad Nacional de Educación a Distancia, Madrid, Spain.
IE University, Madrid, Spain.
J Assoc Inf Sci Technol. 2022 Feb;73(2):258-267. doi: 10.1002/asi.24532. Epub 2021 Jun 14.
The use of artificial intelligence and natural language processing techniques has increased considerably in the last few decades. Historically, the focus has been primarily on texts written in prose, largely leaving aside figurative or poetic expressions of language due to their rich semantics and syntactic complexity. The creation and analysis of poetry have commonly been carried out by hand, with only a few computer-assisted approaches. In the Spanish context, the promise of machine learning is starting to pan out in specific tasks such as metrical annotation and syllabification. However, one task remains unexplored and underdeveloped: stanza classification. Classifying the inner structures of verses upon which a poem is built is an especially relevant task for poetry studies, since it complements the structural information of a poem. In this work, we analyzed different computational approaches to stanza classification in the Spanish poetic tradition. The results show that this task remains hard for computer systems, whether based on classical machine learning approaches or on statistical language models, and that these systems cannot compete with traditional computational paradigms based on expert knowledge.