Department of Psychological Research Methods, Ulm University.
Department of Psychological Methods, University of Amsterdam.
Multivariate Behav Res. 2022 Jul-Aug;57(4):620-641. doi: 10.1080/00273171.2021.1891855. Epub 2021 Mar 24.
Popular measures of reliability for a single-test administration include coefficient α, coefficient λ2, the greatest lower bound (glb), and coefficient ω. First, we show how these measures can be easily estimated within a Bayesian framework. Specifically, the posterior distribution for these measures can be obtained through Gibbs sampling: for coefficients α, λ2, and the glb one can sample the covariance matrix from an inverse Wishart distribution; for coefficient ω one samples the conditional posterior distributions from a single-factor CFA model. Simulations show that, under relatively uninformative priors, the 95% Bayesian credible intervals are highly similar to the 95% frequentist bootstrap confidence intervals. In addition, the posterior distribution can be used to address practically relevant questions, such as "what is the probability that the reliability of this test is between .70 and .90?", or, "how likely is it that the reliability of this test is higher than .80?" In general, the use of a posterior distribution highlights the inherent uncertainty with respect to the estimation of reliability measures.
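As an illustration of the covariance-based approach, the following is a minimal sketch for coefficient α (Cronbach's alpha) only, not the authors' implementation. It assumes multivariate normal item scores, treats the centered data as having a known mean, and uses an inverse Wishart prior whose hyperparameters (`prior_df`, `prior_scale`) are chosen purely for illustration. The function names and the simulated data set are hypothetical.

```python
import numpy as np
from scipy.stats import invwishart


def cronbach_alpha(cov):
    """Coefficient alpha computed from a k x k item covariance matrix."""
    k = cov.shape[0]
    return k / (k - 1) * (1.0 - np.trace(cov) / cov.sum())


def posterior_alpha(data, n_draws=2000, seed=0):
    """Draw from the posterior of alpha via inverse Wishart draws of the covariance.

    Assumptions (illustrative, not taken from the paper):
    prior Sigma ~ IW(df=k, scale=I); data centered and treated as mean-known,
    which gives the conjugate posterior IW(df=k + n, scale=I + scatter).
    """
    data = np.asarray(data, dtype=float)
    n, k = data.shape
    centered = data - data.mean(axis=0)
    scatter = centered.T @ centered          # sum of squares and cross-products
    prior_df, prior_scale = k, np.eye(k)     # hypothetical prior settings
    posterior = invwishart(df=prior_df + n, scale=prior_scale + scatter)
    cov_draws = posterior.rvs(size=n_draws, random_state=seed)  # (n_draws, k, k)
    return np.array([cronbach_alpha(c) for c in cov_draws])


# Simulated single-factor data: 300 respondents, 5 items (made up for this demo).
rng = np.random.default_rng(1)
factor = rng.normal(size=(300, 1))
items = 0.7 * factor + rng.normal(scale=np.sqrt(0.51), size=(300, 5))

draws = posterior_alpha(items)
lower, upper = np.percentile(draws, [2.5, 97.5])
print(f"95% credible interval for alpha: [{lower:.3f}, {upper:.3f}]")
print(f"P(.70 < alpha < .90) = {np.mean((draws > 0.70) & (draws < 0.90)):.3f}")
print(f"P(alpha > .80)       = {np.mean(draws > 0.80):.3f}")
```

Because every posterior draw of the covariance matrix yields one value of α, the two probability questions quoted in the abstract reduce to simple proportions of draws. The same draws could in principle feed other covariance-based coefficients, whereas coefficient ω would need a sampler for the single-factor model instead.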