The complexity of measuring reliability in learning tasks: An illustration using the Alternating Serial Reaction Time Task.

Affiliations

Université Paris-Saclay, UVSQ, Inserm, CESP, 94807, Villejuif, France.

Institut du Psychotraumatisme de l'Enfant et de l'Adolescent, Conseil Départemental Yvelines et Hauts-de-Seine, CH Versailles, 78000, Versailles, France.

Publication information

Behav Res Methods. 2024 Jan;56(1):301-317. doi: 10.3758/s13428-022-02038-5. Epub 2023 Jan 5.

Abstract

Despite the fact that reliability estimation is crucial for robust inference, it is underutilized in neuroscience and cognitive psychology. Appreciating reliability can help researchers increase statistical power, effect sizes, and reproducibility, decrease the impact of measurement error, and inform methodological choices. However, accurately calculating reliability for many experimental learning tasks is challenging. In this study, we highlight a number of these issues, and estimate multiple metrics of internal consistency and split-half reliability of a widely used learning task on a large sample of 180 subjects. We show how pre-processing choices, task length, and sample size can affect reliability and its estimation. Our results show that the Alternating Serial Reaction Time Task has respectable reliability, especially when learning scores are calculated based on reaction times and two-stage averaging. We also show that a task length of 25 blocks can be sufficient to meet the usual thresholds for minimally acceptable reliability. We further illustrate how relying on a single point estimate of reliability can be misleading, and the calculation of multiple metrics, along with their uncertainties, can lead to a more complete characterization of the psychometric properties of tasks.
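
To make the idea of split-half reliability and its dependence on task length concrete, the following is a minimal sketch, not the authors' analysis pipeline. It assumes each subject has one learning score per block, splits the blocks into two random halves many times, correlates the half-scores across subjects, and applies the Spearman-Brown correction. The function names (`pearson`, `spearman_brown`, `simulate_scores`, `split_half_estimates`) and the simulated data are hypothetical and chosen only for illustration.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_brown(r, k=2.0):
    """Predicted reliability when the test is lengthened by a factor k."""
    return k * r / (1 + (k - 1) * r)


def simulate_scores(n_subjects, n_blocks, seed=0, noise_sd=2.0):
    """Hypothetical data: a stable per-subject effect plus block-level noise."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_subjects):
        true_score = rng.gauss(0.0, 1.0)
        scores.append([true_score + rng.gauss(0.0, noise_sd)
                       for _ in range(n_blocks)])
    return scores


def split_half_estimates(scores, n_splits=200, seed=0):
    """Spearman-Brown-corrected split-half estimates over random splits.

    scores: one list of per-block learning scores per subject.
    Returns the sorted list of estimates, one per random split.
    """
    rng = random.Random(seed)
    block_idx = list(range(len(scores[0])))
    half = len(block_idx) // 2
    estimates = []
    for _ in range(n_splits):
        rng.shuffle(block_idx)
        first, second = block_idx[:half], block_idx[half:]
        a = [mean(s[i] for i in first) for s in scores]
        b = [mean(s[i] for i in second) for s in scores]
        estimates.append(spearman_brown(pearson(a, b)))
    return sorted(estimates)


# Simulated example: 100 subjects, 25 blocks.
scores = simulate_scores(n_subjects=100, n_blocks=25, seed=1)
estimates = split_half_estimates(scores, n_splits=200, seed=0)
low = estimates[int(0.025 * len(estimates))]
high = estimates[int(0.975 * len(estimates)) - 1]
print(f"mean split-half reliability: {mean(estimates):.2f}")
print(f"range covering 95% of random splits: {low:.2f} to {high:.2f}")
```

The spread printed at the end reflects only variability across random splits, not sampling uncertainty across subjects; a bootstrap over subjects would be one way to approximate the latter. The same `spearman_brown` function, called with `k` set to the ratio of a planned task length to the observed one, gives a rough projection of how reliability might change with more or fewer blocks, under the assumption that blocks are interchangeable.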

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/740f/10794483/f984ceafff08/13428_2022_2038_Fig1_HTML.jpg
