Effects on Text Simplification: Evaluation of Splitting Up Noun Phrases.

Author information

Leroy Gondy, Kauchak David, Hogue Alan

Affiliations

a Management Information Systems Department, University of Arizona, Tucson, Arizona, USA.

b Computer Science Department, Pomona College, Claremont, California, USA.

Publication information

J Health Commun. 2016;21 Suppl 1(Suppl):18-26. doi: 10.1080/10810730.2015.1131775.

Abstract

To help increase health literacy, we are developing a text simplification tool that creates more accessible patient education materials. Tool development is guided by a data-driven feature analysis comparing simple and difficult text. In the present study, we focus on the common advice to split long noun phrases. Our previous corpus analysis showed that easier texts contained shorter noun phrases. Subsequently, we conducted a user study to measure the difficulty of sentences containing noun phrases of different lengths (2-gram, 3-gram, and 4-gram); noun phrases of different conditions (split or not); and, to simulate unknown terms, pseudowords (present or not). We gathered 35 evaluations for 30 sentences in each condition (3 × 2 × 2 conditions) on Amazon's Mechanical Turk (N = 12,600). We conducted a 3-way analysis of variance for perceived and actual difficulty. Splitting noun phrases had a positive effect on perceived difficulty but a negative effect on actual difficulty. The presence of pseudowords increased perceived and actual difficulty. Without pseudowords, longer noun phrases led to increased perceived and actual difficulty. A follow-up study using the phrases (N = 1,350) showed that measuring awkwardness may indicate when to split noun phrases. We conclude that splitting noun phrases benefits perceived difficulty but hurts actual difficulty when the phrasing becomes less natural.
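The reported sample size can be checked against the factorial design described in the abstract. The sketch below is only an arithmetic illustration, not the authors' analysis code. It assumes that "35 evaluations for 30 sentences in each condition" means 35 evaluations per sentence and 30 sentences per condition, a reading that matches the reported N = 12,600. All variable names are hypothetical.

```python
from itertools import product

# Factor levels as described in the abstract (labels are illustrative).
phrase_lengths = ["2-gram", "3-gram", "4-gram"]
split_conditions = ["split", "not split"]
pseudoword_conditions = ["pseudowords present", "pseudowords absent"]

# Assumed reading of the abstract: 30 sentences per condition,
# each evaluated 35 times.
SENTENCES_PER_CONDITION = 30
EVALUATIONS_PER_SENTENCE = 35

# Full factorial crossing of the three factors (3 x 2 x 2).
conditions = list(product(phrase_lengths, split_conditions, pseudoword_conditions))

total_evaluations = (
    len(conditions) * SENTENCES_PER_CONDITION * EVALUATIONS_PER_SENTENCE
)

print(f"conditions: {len(conditions)}")
print(f"total evaluations: {total_evaluations}")
```

Under this reading, the 12 cells of the design, 30 sentences per cell, and 35 evaluations per sentence multiply out to the N stated in the abstract.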
