Comparison of different reliability estimation methods for single-item assessment: a simulation study.

Authors

Zhang Sijun, Colvin Kimberly

Affiliations

Institute of Educational Sciences, Hunan University, Changsha, China.

School of Education, University at Albany, Albany, NY, United States.

Publication information

Front Psychol. 2024 Nov 1;15:1482016. doi: 10.3389/fpsyg.2024.1482016. eCollection 2024.

DOI:10.3389/fpsyg.2024.1482016

PMID:39554704

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11568483/

Abstract

Single-item assessments have recently become popular in various fields, and researchers have developed several methods for estimating their reliability. Some are based on factor analysis or correction for attenuation; others use the double monotonicity model, Guttman's λ, or the latent class model. However, no empirical study has investigated which method best estimates the reliability of single-item assessments. This study addressed that question with a simulation study. To represent assessments as they are found in practice, the simulation varied four factors: the item discrimination parameter, the test length of the multi-item assessment of the same construct, the sample size, and the correlation between the single-item assessment and the multi-item assessment of the same construct. The results suggest that by using the method based on the double monotonicity model and the method based on correction for attenuation together, researchers can obtain the most precise estimate of the range of reliability of a single-item assessment in 94.44% of cases. None of the four simulation factors (test length of the multi-item assessment, item discrimination parameter, sample size, or correlation between the single-item and multi-item assessments) influenced the choice of method.
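As an illustration of the correction-for-attenuation idea mentioned in the abstract, the following minimal Python sketch simulates one single-item score and one multi-item scale measuring the same construct, then solves the classical attenuation formula for the single-item reliability. Several things here are assumptions and are not taken from the article: the continuous linear factor model used to generate the data (the study itself varies item discrimination parameters), the use of Cronbach's alpha as the reliability of the multi-item scale, the assumed true-score correlation of 1.0, the loading of 0.7, and all function and variable names. It is a sketch of the general technique, not the authors' implementation.

```python
import numpy as np


def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_persons, n_items) score matrix."""
    k = items.shape[1]
    sum_item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - sum_item_vars / total_var)


def single_item_reliability_ca(single: np.ndarray,
                               multi: np.ndarray,
                               true_corr: float = 1.0) -> float:
    """Solve the attenuation formula for the single-item reliability.

    Attenuation: r_xy = true_corr * sqrt(r_xx * r_yy)
    Rearranged:  r_xx = r_xy**2 / (r_yy * true_corr**2)
    true_corr is an ASSUMED true-score correlation (1.0 by default).
    """
    r_xy = np.corrcoef(single, multi.sum(axis=1))[0, 1]
    r_yy = cronbach_alpha(multi)
    return r_xy ** 2 / (r_yy * true_corr ** 2)


# Hypothetical simulation settings (not the article's design).
rng = np.random.default_rng(seed=2024)
n_persons, n_items, loading = 20_000, 10, 0.7

theta = rng.standard_normal(n_persons)   # latent trait
noise_sd = np.sqrt(1.0 - loading ** 2)   # keeps each item at unit variance

# One single-item score and a 10-item scale, all loading 0.7 on theta.
single = loading * theta + noise_sd * rng.standard_normal(n_persons)
multi = (loading * theta[:, None]
         + noise_sd * rng.standard_normal((n_persons, n_items)))

alpha = cronbach_alpha(multi)
estimate = single_item_reliability_ca(single, multi)

# Under this generating model the true single-item reliability is loading**2.
print(f"alpha of multi-item scale:         {alpha:.3f}")
print(f"estimated single-item reliability: {estimate:.3f}")
print(f"true single-item reliability:      {loading ** 2:.3f}")
```

Under these assumptions the estimate should land close to the squared loading. If the true-score correlation between the two measures is actually below 1.0, the default setting understates the single-item reliability, which is one reason such an estimate is usually read as a bound rather than a point value.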
