Leroy Gondy, Kauchak David, Hogue Alan
a Management Information Systems Department, University of Arizona, Tucson, Arizona, USA.
b Computer Science Department, Pomona College, Claremont, California, USA.
J Health Commun. 2016;21 Suppl 1(Suppl):18-26. doi: 10.1080/10810730.2015.1131775.
To help increase health literacy, we are developing a text simplification tool that creates more accessible patient education materials. Tool development is guided by a data-driven feature analysis comparing simple and difficult text. In the present study, we focus on the common advice to split long noun phrases. Our previous corpus analysis showed that easier texts contained shorter noun phrases. Subsequently, we conducted a user study to measure the difficulty of sentences that varied on three factors: noun phrase length (2-gram, 3-gram, or 4-gram); noun phrase condition (split or not); and, to simulate unknown terms, pseudowords (present or not). We gathered 35 evaluations for each of 30 sentences in each condition (3 × 2 × 2 conditions) on Amazon's Mechanical Turk (N = 12,600). We conducted a 3-way analysis of variance for perceived and actual difficulty. Splitting noun phrases had a beneficial effect on perceived difficulty but a detrimental effect on actual difficulty. The presence of pseudowords increased perceived and actual difficulty. Without pseudowords, longer noun phrases led to increased perceived and actual difficulty. A follow-up study using the phrases (N = 1,350) showed that measuring awkwardness may indicate when to split noun phrases. We conclude that splitting noun phrases benefits perceived difficulty but hurts actual difficulty when the phrasing becomes less natural.
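The sample size reported above follows from the factorial design. A minimal sketch of that arithmetic, assuming the abstract means 35 evaluations for each of 30 sentences in every condition; the level labels and variable names are illustrative and do not come from the study materials:

```python
from itertools import product

# Factor levels as described in the abstract; the label strings are illustrative.
noun_phrase_lengths = ["2-gram", "3-gram", "4-gram"]
split_conditions = ["split", "not split"]
pseudoword_conditions = ["pseudowords present", "pseudowords absent"]

# Full 3 x 2 x 2 factorial crossing of the three factors.
conditions = list(product(noun_phrase_lengths, split_conditions, pseudoword_conditions))

# Assumed reading of the abstract: 30 sentences per condition,
# 35 evaluations per sentence.
SENTENCES_PER_CONDITION = 30
EVALUATIONS_PER_SENTENCE = 35

total_evaluations = len(conditions) * SENTENCES_PER_CONDITION * EVALUATIONS_PER_SENTENCE

print(len(conditions))      # number of cells in the design
print(total_evaluations)    # total evaluations across all cells
```

Under this reading, the 12 cells of the design yield 12 × 30 × 35 = 12,600 evaluations, which matches the reported N.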