Effects of interrater reliability of psychopathologic assessment on power and sample size calculations in clinical trials.

Authors

Müller Matthias J, Szegedi Armin

Affiliation

Department of Psychiatry, University of Mainz, Germany.

Publication

J Clin Psychopharmacol. 2002 Jun;22(3):318-25. doi: 10.1097/00004714-200206000-00013.

DOI: 10.1097/00004714-200206000-00013
PMID: 12006903
Abstract

Although rater training is increasingly used to improve the quality of the investigated outcome parameters, the reliability of assessments is not perfect. Thus, empirical reliability estimates should be used instead of theoretically assumed perfect reliability. Implications of the reliability of psychiatric assessments for sample size and power calculations in clinical trials are presented. The theoretical basis of sample size and power calculations using empirical reliability scores is delineated. Examples from contemporary research on schizophrenia and depression are used to illustrate several implications for study design and interpretation of results. The tremendous impact of the lack of reliability of psychopathologic assessments on sample size, power, and detectable true score differences in clinical trials is shown. The problem of multiple outcome variables with different reliabilities is addressed. Studies lacking power because of unreliable assessments carry the risk of false-negative findings and raise ethical questions. Rater training is strongly recommended to assess and improve interrater reliability whenever necessary and possible before trials are started. Sample size calculations and power analysis should be based on empirical reliability values of outcome parameters as part of quality assurance and cost savings.
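
The abstract's central point, that imperfect reliability inflates the required sample size, can be illustrated with a minimal sketch. It is not the paper's own derivation: it assumes the classical test theory attenuation relation (observed standardized effect = true effect × √reliability) and the usual normal-approximation formula for a two-sided, two-group comparison of means. The function name `n_per_group` and its default values are hypothetical, chosen only for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(true_d, reliability=1.0, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-group comparison.

    Assumption (not taken from the paper): measurement error attenuates
    the standardized effect as observed_d = true_d * sqrt(reliability).
    """
    if not 0 < reliability <= 1:
        raise ValueError("reliability must be in (0, 1]")
    observed_d = true_d * sqrt(reliability)
    z = NormalDist().inv_cdf          # standard normal quantile function
    z_alpha = z(1 - alpha / 2)        # two-sided significance threshold
    z_beta = z(power)                 # quantile matching the target power
    return ceil(2 * (z_alpha + z_beta) ** 2 / observed_d ** 2)


if __name__ == "__main__":
    for r in (1.0, 0.9, 0.7, 0.5):
        print(f"reliability={r:.1f}: n per group = {n_per_group(0.5, r)}")
```

Under these assumptions the required sample size scales with 1/reliability, so a true effect of d = 0.5 that needs about 63 patients per group with a perfectly reliable scale needs about 90 per group at a reliability of 0.7.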

Similar articles

1. Effects of interrater reliability of psychopathologic assessment on power and sample size calculations in clinical trials.
   J Clin Psychopharmacol. 2002 Jun;22(3):318-25. doi: 10.1097/00004714-200206000-00013.
2. Comments on "Effects of interrater reliability of psychopathologic assessment on power and sample size calculations in clinical trials" by Muller and associates.
   J Clin Psychopharmacol. 2003 Dec;23(6):681. doi: 10.1097/01.jcp.0000096255.95165.6f.
3. Stroke aetiological classification reliability and effect on trial sample size: systematic review, meta-analysis and statistical modelling.
   Trials. 2019 Feb 8;20(1):107. doi: 10.1186/s13063-019-3222-x.
4. Practical guide to sample size calculations: non-inferiority and equivalence trials.
   Pharm Stat. 2016 Jan-Feb;15(1):80-9. doi: 10.1002/pst.1716. Epub 2015 Nov 25.
5. Practical guide to sample size calculations: superiority trials.
   Pharm Stat. 2016 Jan-Feb;15(1):75-9. doi: 10.1002/pst.1718. Epub 2015 Nov 20.
6. Improved reliability of the Standardized Alzheimer's Disease Assessment Scale (SADAS) compared with the Alzheimer's Disease Assessment Scale (ADAS).
   J Am Geriatr Soc. 1996 Jun;44(6):712-6. doi: 10.1111/j.1532-5415.1996.tb01838.x.
7. Peer review of the quality of care. Reliability and sources of variability for outcome and process assessments.
   JAMA. 1997 Nov 19;278(19):1573-8.
8. Penny-wise and pound-foolish: the impact of measurement error on sample size requirements in clinical trials.
   Biol Psychiatry. 2000 Apr 15;47(8):762-6. doi: 10.1016/s0006-3223(00)00837-4.
9. Reliability and Photographic Equivalency of the Scar Cosmesis Assessment and Rating (SCAR) Scale, an Outcome Measure for Postoperative Scars.
   JAMA Dermatol. 2017 Jan 1;153(1):55-60. doi: 10.1001/jamadermatol.2016.3757.
10. Non-linearity of Parkinson's disease progression: implications for sample size calculations in clinical trials.
    Clin Trials. 2005;2(6):509-18. doi: 10.1191/1740774505cn125oa.

Cited by

1. Psychoeducation Group for Depression (PEG-D): Study protocol for a prospective, randomized, single-blind, crossover trial.
   PLoS One. 2025 Aug 8;20(8):e0329006. doi: 10.1371/journal.pone.0329006. eCollection 2025.
2. Development of Toddlers' Smartphone Flow State Scale: Parent Report Form.
   Int J Environ Res Public Health. 2021 Nov 11;18(22):11833. doi: 10.3390/ijerph182211833.
3. Design and conduct of confirmatory chronic pain clinical trials.
   Pain Rep. 2020 Dec 18;6(1):e845. doi: 10.1097/PR9.0000000000000854. eCollection 2021 Jan-Feb.
4. Stroke aetiological classification reliability and effect on trial sample size: systematic review, meta-analysis and statistical modelling.
   Trials. 2019 Feb 8;20(1):107. doi: 10.1186/s13063-019-3222-x.
5. Audio-digital recordings for surveillance in clinical trials of major depressive disorder.
   Contemp Clin Trials Commun. 2019 Jan 8;14:100317. doi: 10.1016/j.conctc.2019.100317. eCollection 2019 Jun.
6. Ratings surveillance and reliability in a study of major depressive disorder with subthreshold hypomania (mixed features).
   Int J Methods Psychiatr Res. 2018 Dec;27(4):e1729. doi: 10.1002/mpr.1729. Epub 2018 Jun 26.
7. Measuring pathology using the PANSS across diagnoses: Inconsistency of the positive symptom domain across schizophrenia, schizoaffective, and bipolar disorder.
   Psychiatry Res. 2017 Dec;258:207-216. doi: 10.1016/j.psychres.2017.08.009. Epub 2017 Aug 16.
8. The Depression Inventory Development Workgroup: A Collaborative, Empirically Driven Initiative to Develop a New Assessment Tool for Major Depressive Disorder.
   Innov Clin Neurosci. 2016 Oct 1;13(9-10):20-31. eCollection 2016 Sep-Oct.
9. Reliability of infarct volumetry: Its relevance and the improvement by a software-assisted approach.
   J Cereb Blood Flow Metab. 2017 Aug;37(8):3015-3026. doi: 10.1177/0271678X16681311. Epub 2016 Jan 1.
10. Reliability of the adult myopathy assessment tool in individuals with myositis.
    Arthritis Care Res (Hoboken). 2015 Apr;67(4):563-70. doi: 10.1002/acr.22473.