Institute of Medical Informatics, University of Münster, Münster, Germany.
Department of Internal Medicine (D), University Hospital of Münster, Münster, Germany.
BMC Med Res Methodol. 2022 May 14;22(1):141. doi: 10.1186/s12874-022-01611-y.
BACKGROUND: Screening for eligible patients continues to pose a great challenge for many clinical trials. This has led to rapidly growing interest in standardizing computable representations of eligibility criteria (EC) in order to develop tools that leverage data from electronic health record (EHR) systems. Although laboratory procedures (LP) are a common EC entity that is readily available and retrievable from EHR systems, interoperable data models for this entity are lacking. A public, specialized data model that uses international, widely adopted terminology for LP, e.g. Logical Observation Identifiers Names and Codes (LOINC®), is much needed to support automated screening tools.
OBJECTIVE: The aim of this study is to establish, using LOINC terminology, a core dataset of the LP most frequently requested when recruiting patients for clinical trials. Employing such a core dataset could enhance the interface between study feasibility platforms and EHR systems and significantly improve automatic patient recruitment.
METHODS: We used a semi-automated approach to analyze 10,516 screening forms from the Medical Data Models (MDM) portal's data repository that are pre-annotated with Unified Medical Language System (UMLS) concepts. An automated semantic analysis based on concept frequency was followed by an extensive manual expert review, performed by physicians, of complex recruitment-relevant concepts not amenable to an automated approach.
RESULTS: Based on an analysis of 138,225 EC from 10,516 screening forms, 55 laboratory procedures represented 77.87% of all UMLS laboratory concept occurrences identified in the selected EC forms. We identified 26,413 unique UMLS concepts from 118 UMLS semantic types, covering the vast majority of Medical Subject Headings (MeSH) disease domains.
CONCLUSIONS: A small set of common LP covers the majority of laboratory concepts in screening EC forms, which supports the feasibility of establishing a focused core dataset for LP. We present ELaPro, a novel, LOINC-mapped core dataset of the 55 LP most frequently requested in screening for clinical trials. ELaPro is available in multiple machine-readable data formats, such as CSV, ODM and HL7 FHIR. The extensive manual curation of this large number of free-text EC and the combination of UMLS and LOINC terminologies distinguish this specialized dataset from previous related datasets in the literature.
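The frequency-based step described in METHODS and the coverage figure reported in RESULTS (a small set of concepts accounting for a large share of all occurrences) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names and the toy concept labels below are hypothetical placeholders, and real input would be the UMLS concept annotations attached to each eligibility criterion.

```python
from collections import Counter


def coverage_of_top_k(concept_occurrences, k):
    """Return the k most frequent concepts and the share of all
    occurrences that they account for (0.0 for empty input)."""
    counts = Counter(concept_occurrences)
    total = sum(counts.values())
    top = counts.most_common(k)
    covered = sum(n for _, n in top)
    share = covered / total if total else 0.0
    return [concept for concept, _ in top], share


def smallest_k_for_coverage(concept_occurrences, target):
    """Return the smallest number of top-ranked concepts whose
    cumulative share of occurrences reaches `target` (0..1)."""
    counts = Counter(concept_occurrences)
    total = sum(counts.values())
    covered = 0
    for k, (_, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered / total >= target:
            return k
    return len(counts)


# Toy data (assumption): one label per annotated occurrence in the EC text.
toy_occurrences = (
    ["hemoglobin"] * 5 + ["creatinine"] * 3 + ["platelets"] + ["bilirubin"]
)

top_concepts, share = coverage_of_top_k(toy_occurrences, 2)
k_needed = smallest_k_for_coverage(toy_occurrences, 0.75)
```

In this toy input the two most frequent labels account for 8 of 10 occurrences. The same calculation on the study's annotated forms is what a statement such as "55 laboratory procedures represented 77.87% of all occurrences" summarizes, although the manual expert review described above has no counterpart in this sketch.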