Leroy Gondy, Helmreich Stephen, Cowie James R, Miller Trudi, Zheng Wei
Claremont Graduate University, Claremont, CA, USA.
AMIA Annu Symp Proc. 2008 Nov 6;2008:394-8.
Although understanding health information is important, the texts provided are often difficult to understand. Formulas exist to measure readability levels, but little is known about how linguistic structures contribute to these difficulties. We are developing a toolkit of linguistic metrics that are validated with representative users and can be measured automatically. In this study, we provide an overview of our corpus and of how readability differs by topic and source. We compare two documents on three groups of linguistic metrics. We report on a user study evaluating one of the differentiating metrics: the percentage of function words in a sentence. Our results show that this percentage correlates significantly with ease of understanding as indicated by users, but not with the levels assigned by commonly used readability formulas. Our study is the first to propose a user-validated metric that is distinct from readability formulas.
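As an illustration of the metric evaluated in the user study, the percentage of function words in a sentence could be computed roughly as follows. This is a minimal sketch under stated assumptions, not the authors' toolkit: the FUNCTION_WORDS list, the regular-expression tokenizer, and the name function_word_percentage are all hypothetical and chosen only for this example. The abstract does not say how function words were identified.

```python
import re

# Hypothetical, deliberately small function-word list (an assumption for
# illustration only; the study's actual list is not given in the abstract).
FUNCTION_WORDS = frozenset({
    "a", "an", "the", "and", "or", "but", "if", "of", "in", "on", "at",
    "to", "for", "with", "by", "from", "as", "is", "are", "was", "were",
    "be", "been", "it", "this", "that", "these", "those", "he", "she",
    "we", "you", "they", "not", "can", "will", "may", "should", "there",
})


def function_word_percentage(sentence: str) -> float:
    """Return the share of tokens in `sentence` that are function words (0-100)."""
    # Simple assumed tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", sentence.lower())
    if not tokens:
        return 0.0  # avoid division by zero for empty or non-alphabetic input
    hits = sum(1 for token in tokens if token in FUNCTION_WORDS)
    return 100.0 * hits / len(tokens)


if __name__ == "__main__":
    print(function_word_percentage("The cat sat on the mat."))
```

With this word list, "The cat sat on the mat." has six tokens, three of which ("the", "on", "the") are function words. A fuller implementation would more plausibly rely on part-of-speech tags than on a fixed list.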