Crane Paul K, Narasimhalu Kaavya, Gibbons Laura E, Mungas Dan M, Haneuse Sebastien, Larson Eric B, Kuller Lewis, Hall Kathleen, van Belle Gerald
Department of Medicine, University of Washington, Seattle, WA, USA.
J Clin Epidemiol. 2008 Oct;61(10):1018-27.e9. doi: 10.1016/j.jclinepi.2007.11.011. Epub 2008 May 5.
To cocalibrate the Mini-Mental State Examination, the Modified Mini-Mental State, the Cognitive Abilities Screening Instrument, and the Community Screening Instrument for Dementia using item response theory (IRT) in order to (1) compare the screening cut points used to identify cases of dementia in different studies, (2) compare the measurement properties of the tests, and (3) explore the implications of these measurement properties for longitudinal studies of cognitive functioning.
We used cross-sectional data from three large (n>1000) community-based studies of cognitive functioning in the elderly. We used IRT to cocalibrate the scales and performed simulations of longitudinal studies.
Screening cut points varied widely across studies. The four tests have curvilinear scaling and varying levels of measurement precision, with more measurement error at higher levels of cognitive functioning. In longitudinal simulations, IRT scores always performed better than standard scoring, whereas a strategy to account for varying measurement precision had mixed results.
Cocalibration allows direct comparison of cognitive functioning in studies using any of these four tests. Standard scoring appears to be a poor choice for analysis of longitudinal cognitive testing data. More research is needed into the implications of varying levels of measurement precision.