School of Computing, Ulster University, Newtownabbey, United Kingdom.
Organisation for the Review of Care and Health Applications, Daresbury, United Kingdom.
JMIR Mhealth Uhealth. 2022 Aug 18;10(8):e37290. doi: 10.2196/37290.
The System Usability Scale (SUS) is widely used to quantify the usability of many software and hardware products. However, the SUS was not designed specifically to evaluate mobile apps or, in particular, digital health apps (DHAs).
The aim of this study was to examine whether the widely used SUS distribution for benchmarking (mean 68, SD 12.5) can be used to reliably assess the usability of DHAs.
A literature search of the ACM Digital Library, IEEE Xplore, CORE, PubMed, and Google Scholar databases was performed to identify SUS scores related to the usability of DHAs for meta-analysis. Papers that published SUS scores of the evaluated DHAs between 2011 and 2021 were included to give a 10-year representation. In total, 117 SUS scores for 114 DHAs were identified. RStudio and the R programming language were used to model the DHA SUS distribution, and a 1-sample, 2-tailed t test was used to compare this distribution with the standard SUS distribution.
When all collected apps were included, the mean SUS score was 76.64 (SD 15.12); however, this distribution was negatively skewed (skewness -0.52) and was not normally distributed according to the Shapiro-Wilk test (P=.002). The mean SUS score for "physical activity" apps was 83.28 (SD 12.39), and these apps drove the skewness. With "physical activity" apps excluded, the mean SUS score for the remaining apps was 68.05 (SD 14.05). A 1-sample, 2-tailed t test indicated that this health app SUS distribution was not statistically significantly different from the standard SUS distribution (P=.98).
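The 1-sample, 2-tailed t test described above can be illustrated with a minimal sketch. The study itself used R; the Python below is not the authors' code, and the function names and the example scores are hypothetical. It assumes only that a list of mean SUS scores is compared against the benchmark mean of 68. A 1-sample test uses the benchmark mean alone, so the benchmark SD of 12.5 does not enter the calculation. The P value is obtained by numerically integrating the Student t density, which is an implementation choice made here to keep the sketch free of third-party libraries.

```python
import math
import statistics

SUS_BENCHMARK_MEAN = 68.0  # benchmark mean quoted in the abstract


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t_stat, df, steps=2000):
    """Two-tailed P value via Simpson's rule (steps must be even)."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (total * h / 3)  # P(-|t| < T < |t|) by symmetry
    return max(0.0, 1.0 - central_area)


def one_sample_t_test(scores, benchmark_mean=SUS_BENCHMARK_MEAN):
    """Return (t statistic, degrees of freedom, two-tailed P value)."""
    n = len(scores)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample SD (n - 1 denominator)
    t_stat = (mean - benchmark_mean) / (sd / math.sqrt(n))
    df = n - 1
    return t_stat, df, two_tailed_p(t_stat, df)


if __name__ == "__main__":
    # Hypothetical mean SUS scores, NOT data from the study.
    example_scores = [60.0, 72.5, 65.0, 80.0, 55.0, 70.0, 77.5, 62.5]
    t_stat, df, p_value = one_sample_t_test(example_scores)
    print(f"t({df}) = {t_stat:.3f}, P = {p_value:.3f}")
```

A large P value, as reported for the apps excluding "physical activity" apps, means the sample gives no evidence that its mean differs from 68. It does not by itself demonstrate that the two distributions are identical.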
This study concludes that the SUS and the widely accepted benchmark of a mean SUS score of 68 (SD 12.5) are suitable for evaluating the usability of DHAs. We speculate as to why physical activity apps received higher SUS scores than expected. A template for reporting mean SUS scores to facilitate meta-analysis is proposed, together with future work that could be done to further examine the SUS benchmark scores for DHAs.