Procura-PALavras (P-PAL): A Web-based interface for a new European Portuguese lexical database.

Affiliations

Human Cognition Lab, CIPsi, School of Psychology, University of Minho, Campus de Gualtar, 4710-057, Braga, Portugal.

Centre for Humanistic Studies, University of Minho, Braga, Portugal.

Publication information

Behav Res Methods. 2018 Aug;50(4):1461-1481. doi: 10.3758/s13428-018-1058-z.

DOI:10.3758/s13428-018-1058-z

PMID:29855811

Abstract

In this article, we present Procura-PALavras (P-PAL), a Web-based interface for a new European Portuguese (EP) lexical database. Based on a contemporary printed corpus of over 227 million words, P-PAL provides a broad range of word attributes and statistics, including several measures of word frequency (e.g., raw counts, per-million word frequency, logarithmic Zipf scale), morpho-syntactic information (e.g., parts of speech [PoSs], grammatical gender and number, dominant PoS, and frequency and relative frequency of the dominant PoS), as well as several lexical and sublexical orthographic (e.g., number of letters; consonant-vowel orthographic structure; density and frequency of orthographic neighbors; orthographic Levenshtein distance; orthographic uniqueness point; orthographic syllabification; and trigram, bigram, and letter type and token frequencies), and phonological measures (e.g., pronunciation, number of phonemes, stress, density and frequency of phonological neighbors, transposed and phonographic neighbors, syllabification, and biphone and phone type and token frequencies) for ~53,000 lemmatized and ~208,000 nonlemmatized EP word forms. To obtain these metrics, researchers can choose between two word queries in the application: (i) analyze words previously selected for specific attributes and/or lexical and sublexical characteristics, or (ii) generate word lists that meet word requirements defined by the user in the menu of analyses. For the measures it provides and the flexibility it allows, P-PAL will be a key resource to support research in all cognitive areas that use EP verbal stimuli. P-PAL is freely available at http://p-pal.di.uminho.pt/tools.
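To make a few of the measures named in the abstract concrete, the following is a minimal Python sketch of how per-million frequency, a Zipf-scale value, Levenshtein distance, and substitution-based orthographic neighbors are commonly computed. It is not P-PAL's implementation, which the abstract does not describe. The Zipf value uses the widely cited definition log10(frequency per million words) + 3; the round corpus size of 227,000,000, the toy lexicon, and all function names are assumptions made for illustration only.

```python
import math

# Assumption: a round figure standing in for the "over 227 million words" corpus.
CORPUS_SIZE = 227_000_000


def per_million(raw_count: int, corpus_size: int = CORPUS_SIZE) -> float:
    """Frequency per million words, derived from a raw corpus count."""
    return raw_count * 1_000_000 / corpus_size


def zipf(raw_count: int, corpus_size: int = CORPUS_SIZE) -> float:
    """Zipf-scale value under the common definition log10(per-million) + 3.

    Assumption: no smoothing is applied, so raw_count must be positive.
    """
    return math.log10(per_million(raw_count, corpus_size)) + 3


def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def substitution_neighbors(word: str, lexicon: list[str]) -> list[str]:
    """Same-length words that differ from `word` in exactly one letter position."""
    return [
        w for w in lexicon
        if len(w) == len(word)
        and sum(x != y for x, y in zip(w, word)) == 1
    ]


# Hypothetical toy lexicon, not drawn from P-PAL.
toy_lexicon = ["gato", "pato", "rato", "mato", "gata", "casa", "gatos"]

neighbors = substitution_neighbors("gato", toy_lexicon)
print(neighbors)                      # neighborhood members
print(len(neighbors))                 # neighborhood density
print(levenshtein("gato", "gatos"))   # edit distance to a longer form
print(zipf(227))                      # a word occurring 227 times in the corpus
```

Neighborhood density here is simply the length of the neighbor list; a neighborhood frequency measure would additionally need each neighbor's corpus count, which the toy lexicon does not include.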

Similar articles

ESCOLEX: a grade-level lexical database from European Portuguese elementary to middle school textbooks.

Behav Res Methods. 2014 Mar;46(1):240-53. doi: 10.3758/s13428-013-0350-1.

On the advantages of word frequency and contextual diversity measures extracted from subtitles: The case of Portuguese.

Q J Exp Psychol (Hove). 2015;68(4):680-96. doi: 10.1080/17470218.2014.964271. Epub 2014 Nov 7.

PHOR-in-One: A multilingual lexical database with PHonological, ORthographic and PHonographic word similarity estimates in four languages.

Behav Res Methods. 2023 Oct;55(7):3699-3725. doi: 10.3758/s13428-022-01985-3. Epub 2022 Nov 7.

BuscaPalabras: a program for deriving orthographic and phonological neighborhood statistics and other psycholinguistic indices in Spanish.

Behav Res Methods. 2005 Nov;37(4):665-71. doi: 10.3758/bf03192738.

The Minho Word Pool: Norms for imageability, concreteness, and subjective frequency for 3,800 Portuguese words.

Behav Res Methods. 2017 Jun;49(3):1065-1081. doi: 10.3758/s13428-016-0767-4.

CLEARPOND: cross-linguistic easy-access resource for phonological and orthographic neighborhood densities.

PLoS One. 2012;7(8):e43230. doi: 10.1371/journal.pone.0043230. Epub 2012 Aug 20.

Type-based bigram frequencies for five-letter words.

Behav Res Methods Instrum Comput. 2004 Aug;36(3):397-401. doi: 10.3758/bf03195587.

Phonographic neighbors, not orthographic neighbors, determine word naming latencies.

Psychon Bull Rev. 2007 Jun;14(3):455-9. doi: 10.3758/bf03194088.

The Malay Lexicon Project: a database of lexical statistics for 9,592 words.

Behav Res Methods. 2010 Nov;42(4):992-1003. doi: 10.3758/BRM.42.4.992.

Cited by

Jiwar: A database and calculator for word neighborhood measures in 40 languages.

Behav Res Methods. 2025 Feb 19;57(3):98. doi: 10.3758/s13428-025-02612-7.

Development of language-specific stress discrimination in European Portuguese: an electrophysiological study.

Front Neurosci. 2024 Sep 20;18:1415854. doi: 10.3389/fnins.2024.1415854. eCollection 2024.

The role of transitional probabilities in word holistic processing.

Perception. 2024 Nov;53(11-12):775-786. doi: 10.1177/03010066241279932. Epub 2024 Sep 17.

Neural and behavioral signatures of the multidimensionality of manipulable object processing.

Commun Biol. 2023 Sep 14;6(1):940. doi: 10.1038/s42003-023-05323-x.

PHOR-in-One: A multilingual lexical database with PHonological, ORthographic and PHonographic word similarity estimates in four languages.

Behav Res Methods. 2023 Oct;55(7):3699-3725. doi: 10.3758/s13428-022-01985-3. Epub 2022 Nov 7.

The relationships between reading fluency and different measures of holistic word processing.

Atten Percept Psychophys. 2022 Jul;84(5):1734-1756. doi: 10.3758/s13414-022-02497-0. Epub 2022 May 12.

Reading Comprehension Predictors in European Portuguese Adults.

Front Psychol. 2021 Dec 2;12:789413. doi: 10.3389/fpsyg.2021.789413. eCollection 2021.

Of Beavers and Tables: The Role of Animacy in the Processing of Grammatical Gender Within a Picture-Word Interference Task.

Front Psychol. 2021 Jul 8;12:661175. doi: 10.3389/fpsyg.2021.661175. eCollection 2021.

Lexico-syntactic interactions during the processing of temporally ambiguous L2 relative clauses: An eye-tracking study with intermediate and advanced Portuguese-English bilinguals.

PLoS One. 2019 May 29;14(5):e0216779. doi: 10.1371/journal.pone.0216779. eCollection 2019.

Early Brain Sensitivity to Word Frequency and Lexicality During Reading Aloud and Implicit Reading.

Front Psychol. 2019 Apr 11;10:830. doi: 10.3389/fpsyg.2019.00830. eCollection 2019.
