Validation of algorithms in studies based on routinely collected health data: general principles.

Affiliations

Department of Clinical Epidemiology, Department of Clinical Medicine, Aarhus University and Aarhus University Hospital, 8200 Aarhus N, Denmark.

Research Unit of Clinical Pharmacology, Pharmacy, and Environmental Medicine, University of Southern Denmark, 5230 Odense M, Denmark.

Publication information

Am J Epidemiol. 2024 Nov 4;193(11):1612-1624. doi: 10.1093/aje/kwae071.

PMID:38754870

Abstract

Clinicians, researchers, regulators, and other decision-makers increasingly rely on evidence from real-world data (RWD), including data routinely accumulating in health and administrative databases. RWD studies often rely on algorithms to operationalize variable definitions. An algorithm is a combination of codes or concepts used to identify persons with a specific health condition or characteristic. Establishing the validity of algorithms is a prerequisite for generating valid study findings that can ultimately inform evidence-based health care. In this paper, we aim to systematize terminology, methods, and practical considerations relevant to the conduct of validation studies of RWD-based algorithms. We discuss measures of algorithm accuracy, gold/reference standards, study size, prioritization of accuracy measures, algorithm portability, and implications for interpretation. Information bias is common in epidemiologic studies, underscoring the importance of transparency in decisions regarding the choice and prioritization of measures of algorithm validity. The validity of an algorithm should be judged in the context of a data source, and one size does not fit all. Prioritizing validity measures within a given data source depends on the role of a given variable in the analysis (eligibility criterion, exposure, outcome, or covariate). Validation work should be part of routine maintenance of RWD sources. This article is part of a Special Collection on Pharmacoepidemiology.
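The abstract refers to measures of algorithm accuracy and to study size without naming specific measures. As a minimal sketch, assuming the measures in question are the standard ones from a 2x2 cross-classification of algorithm result against a reference standard (sensitivity, specificity, positive and negative predictive value), the Python below computes each with a Wilson 95% confidence interval. The function names `wilson_interval` and `accuracy_measures` and the example counts are hypothetical illustrations, not taken from the paper.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half_width, centre + half_width


def accuracy_measures(tp, fp, fn, tn):
    """Accuracy of an algorithm against a reference standard.

    tp: algorithm positive, reference positive
    fp: algorithm positive, reference negative
    fn: algorithm negative, reference positive
    tn: algorithm negative, reference negative
    Returns {measure: (point_estimate, (lower, upper))}.
    """
    cells = {
        "sensitivity": (tp, tp + fn),  # among persons who truly have the condition
        "specificity": (tn, tn + fp),  # among persons who truly lack the condition
        "ppv": (tp, tp + fp),          # among persons flagged by the algorithm
        "npv": (tn, tn + fn),          # among persons not flagged by the algorithm
    }
    return {
        name: (num / den, wilson_interval(num, den))
        for name, (num, den) in cells.items()
    }


if __name__ == "__main__":
    # Hypothetical validation sample: counts are illustrative only.
    for name, (est, (lo, hi)) in accuracy_measures(90, 10, 30, 870).items():
        print(f"{name}: {est:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

The width of each interval depends on the denominator of that measure, which is one way the study-size question shows up in practice: a sample drawn from algorithm-positive records alone can estimate the positive predictive value but gives no information on sensitivity.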

