Chen Wen-Hung, Lenderking William, Jin Ying, Wyrwich Kathleen W, Gelhorn Heather, Revicki Dennis A
Center for Health Outcomes Research, United BioSource Corporation, 7101 Wisconsin Ave., Suite 600, Bethesda, MD, 20814, USA
Qual Life Res. 2014 Mar;23(2):485-93. doi: 10.1007/s11136-013-0487-5. Epub 2013 Aug 3.
Large samples are generally considered necessary for the Rasch model to yield robust item parameter estimates. Recently, small-sample Rasch analysis has been suggested as a preliminary assessment of items' psychometric properties. This study evaluated Rasch analysis results obtained with small sample sizes.
Ten PROMIS pain behavior items were used. Random samples of 30, 50, 100, and 250, and a targeted sample of 30 were drawn 10 times each from a total of 800 subjects. Rasch analysis was conducted for each of these samples and the full sample.
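The repeated-sampling design described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say whether subjects were drawn with or without replacement (without replacement is assumed here), the function name `draw_replicate_samples` and the seed are hypothetical, and the targeted sample is omitted because its selection rule is not given.

```python
import random


def draw_replicate_samples(n_subjects, sizes, n_replicates, seed=0):
    """Draw n_replicates random subsamples of subject indices for each size.

    Assumption: sampling is without replacement within each replicate.
    """
    rng = random.Random(seed)  # fixed seed so the draws are reproducible
    samples = {}
    for size in sizes:
        samples[size] = [
            sorted(rng.sample(range(n_subjects), size))
            for _ in range(n_replicates)
        ]
    return samples


# 800 subjects, sample sizes 30/50/100/250, 10 replicates each (from the abstract)
samples = draw_replicate_samples(800, (30, 50, 100, 250), 10)
for size, replicates in samples.items():
    print(size, len(replicates), len(replicates[0]))
```

Each replicate would then be passed to whatever Rasch software is in use; the fitting step itself is not shown because the abstract does not name the program.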
In the full sample, there were 104 cases of extreme scores, no null categories, two incorrectly ordered items, and four misfitting items. For samples of 250, 100, 50, 30, and targeted 30, the average numbers of extreme scores were 42.2, 17.1, 9.6, 6.1, and 1.2; the average numbers of null categories were 1.0, 3.2, 8.7, 13.4, and 8.3; the average numbers of items with incorrectly ordered item parameters were 0.1, 0.8, 2.9, 4.7, and 3.7; and the average numbers of items with fit residuals exceeding ±2.5 were 0.8, 0.3, 0.1, 0.2, and 0.3, respectively.
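Two of the diagnostics reported above, extreme scores and null categories, can be counted directly from a response matrix. The sketch below assumes the usual definitions (an extreme score is a minimum or maximum possible total; a null category is a response option that no one in the sample selected for a given item) and assumes responses coded 0 to `n_categories - 1`. The function names and the toy data are hypothetical and are not taken from the study.

```python
def count_extreme_scores(responses, n_categories):
    """Count respondents whose total score is the minimum or maximum possible."""
    n_items = len(responses[0])
    max_total = n_items * (n_categories - 1)
    return sum(1 for row in responses if sum(row) in (0, max_total))


def count_null_categories(responses, n_categories):
    """Count item-category combinations with no observations in the sample."""
    n_items = len(responses[0])
    all_categories = set(range(n_categories))
    nulls = 0
    for j in range(n_items):
        observed = {row[j] for row in responses}
        nulls += len(all_categories - observed)
    return nulls


# Toy data: 4 respondents, 3 items, 3 response categories (0, 1, 2)
toy = [
    [0, 0, 0],  # minimum total: extreme score
    [2, 2, 2],  # maximum total: extreme score
    [1, 0, 2],
    [1, 1, 0],
]
n_extreme = count_extreme_scores(toy, 3)
n_null = count_null_categories(toy, 3)
print(n_extreme, n_null)
```

In the toy data, category 1 of the third item is never used, which illustrates why null categories become more frequent as the sample shrinks.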
Rasch analysis based on small samples (≤ 50) identified a greater number of items with incorrectly ordered parameters than larger samples (≥ 100). However, fewer items were identified as misfitting. Results from small samples led to opposite conclusions from those based on larger samples. Rasch analysis based on small samples should be used for exploratory purposes with extreme caution.