Lord S J, Horvath A R, Sandberg S, Monaghan P J, Cobbaert C M, Reim M, Tolios A, Mueller R, Bossuyt P M
National Health and Medical Research Council (NHMRC) Clinical Trials Centre, University of Sydney, Sydney, Australia.
New South Wales Health Pathology Department of Chemical Pathology, Prince of Wales Hospital and School of Medical Sciences, University of New South Wales; School of Public Health, University of Sydney, Australia.
Crit Rev Clin Lab Sci. 2025 May;62(3):182-197. doi: 10.1080/10408363.2025.2453148. Epub 2025 Feb 6.
Recent changes in the regulatory assessment of medical tests reflect a growing recognition of the need for more stringent clinical evidence requirements to protect patient safety and health. Under current regulations in the United States and Europe, when needed for regulatory approval, clinical performance reports must provide clinical evidence tailored to the intended purpose of the test and allow assessment of whether the test will achieve the intended clinical benefit. The quality of evidence must be proportionate to the risk for the patient and/or public health. These requirements now cover both commercial and laboratory-developed tests (LDTs) and demand a sound understanding of the fundamentals of clinical performance measures and study design to develop and appraise the study plan and interpret the study results. However, there is a lack of harmonized guidance for the laboratory profession, industry, regulatory agencies and notified bodies on how the clinical performance of tests should be measured. The Working Group on Test Evaluation (WG-TE) of the European Federation of Clinical Chemistry and Laboratory Medicine (EFLM) is a multidisciplinary group of laboratory professionals, clinical epidemiologists, health technology assessment experts, and representatives of the in vitro diagnostic (IVD) industry. This guidance paper aims to promote a shared understanding of the principles of clinical performance measures and study design. Measures of classification performance, also referred to as discrimination, such as sensitivity and specificity, are firmly established as the primary measures for evaluating the clinical performance of screening and diagnostic tests. We explain that these measures are just as relevant for other purposes of testing. We outline the importance of defining the most clinically meaningful classification of disease so that the clinical benefits of testing for those correctly classified, and the harms for those incorrectly classified, can be explicitly inferred.
We introduce the key principles and a checklist for formulating the research objective and study design to estimate clinical performance: (1) the purpose of the test (e.g., diagnosis, screening, risk stratification, prognosis, or prediction of treatment benefit) and the corresponding research objective for assessing clinical performance; (2) the target condition for clinically meaningful classification; (3) the clinical performance measures used to assess whether the test is fit for purpose; and (4) study design types. Laboratory professionals, industry, and researchers can use this checklist to help identify relevant published studies and primary datasets, and to liaise with clinicians and methodologists when developing a study plan to evaluate clinical performance where this is needed to apply for regulatory approval.
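As an illustration of the classification measures named above, the following minimal sketch computes sensitivity and specificity from a 2x2 cross-classification of test results against the target condition. It is not taken from the guidance paper: the `TwoByTwo` class, the function names, and all counts are hypothetical and chosen only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Hypothetical counts from cross-classifying test result vs. target condition."""
    true_pos: int   # condition present, test positive
    false_neg: int  # condition present, test negative
    false_pos: int  # condition absent, test positive
    true_neg: int   # condition absent, test negative


def sensitivity(t: TwoByTwo) -> float:
    """Proportion of people with the target condition who test positive."""
    with_condition = t.true_pos + t.false_neg
    if with_condition == 0:
        raise ValueError("no participants with the target condition")
    return t.true_pos / with_condition


def specificity(t: TwoByTwo) -> float:
    """Proportion of people without the target condition who test negative."""
    without_condition = t.true_neg + t.false_pos
    if without_condition == 0:
        raise ValueError("no participants without the target condition")
    return t.true_neg / without_condition


# Illustrative (made-up) study: 100 with the condition, 900 without.
table = TwoByTwo(true_pos=90, false_neg=10, false_pos=45, true_neg=855)
print(f"sensitivity = {sensitivity(table):.2f}")  # 90 / 100
print(f"specificity = {specificity(table):.2f}")  # 855 / 900
```

In this sketch, the 10 false negatives and 45 false positives are the incorrectly classified groups for whom potential harm would need to be considered, which is why the choice of target condition determines what the two proportions mean clinically.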