Heinrich Heine University Düsseldorf, Düsseldorf, Germany.
Hum Factors. 2025 Jan;67(1):32-48. doi: 10.1177/00187208241237862. Epub 2024 Mar 14.
In usability studies, the subjective component of usability, perceived usability, is often of interest besides the objective usability components, efficiency and effectiveness. Perceived usability is typically investigated using questionnaires. Our goal was to assess experimentally which of four perceived-usability questionnaires differing in length best reflects the difference in perceived usability between systems.
Conventional measurement wisdom strongly favors multi-item questionnaires, as measures based on more items supposedly yield better results. However, this assumption is controversial. Single-item questionnaires also have distinct advantages and it has been shown repeatedly that single-item measures can be viable alternatives to multi-item measures.
N = 1089 (Experiment 1) and N = 1095 (Experiment 2) participants rated the perceived usability of a good or a poor web-based mobile phone contract system using the 35-item ISONORM 9241/10 (Experiment 1 only), the 10-item System Usability Scale (SUS), the 4-item Usability Metric for User Experience (UMUX), and the single-item Adjective Rating Scale.
The Adjective Rating Scale represented the perceived-usability difference between the two systems at least as well as, or significantly better than, the multi-item questionnaires (significantly better than the UMUX and the ISONORM 9241/10 in Experiment 1, significantly better than the SUS in Experiment 2).
The single-item Adjective Rating Scale is a viable alternative to multi-item perceived-usability questionnaires.
Extremely short instruments can be recommended to measure perceived usability, at least for simple user interfaces that can be considered concrete-singular in the sense that raters understand which entity is being rated and that what is being rated is reasonably homogeneous.