Zhang Hansi, He Zhe, He Xing, Guo Yi, Nelson David R, Modave François, Wu Yonghui, Hogan William, Prosperi Mattia, Bian Jiang
University of Florida, Gainesville, FL, USA.
Florida State University, Tallahassee, FL, USA.
AMIA Annu Symp Proc. 2018 Dec 5;2018:1601-1610. eCollection 2018.
The increasing adoption of electronic health record (EHR) systems and the proliferation of clinical data offer unprecedented opportunities for cohort identification to accelerate patient recruitment. However, clinical investigators must spend substantial effort translating trial eligibility criteria into correct cohort identification queries, at least in part because clear definitions are lacking in both the free-text eligibility criteria and the data models used to structure the available data elements in target patient databases. We propose an ontology-driven data access approach that generates formal representations of the connections between the entities in eligibility criteria and the available data elements in order to (1) narrow the semantic gap between researchers' cohort identification needs and the underlying database nuances, and (2) render the eligibility criteria computable. We implemented our approach based on an analysis of the eligibility criteria from 77 hepatitis C trials. We found that 4 major types of data manipulation queries and 4 temporal patterns covered all eligibility criteria that were computable. We built a prototype system that helps researchers write computable eligibility criteria and execute them against clinical data in real time to find potential trial cohorts.
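To make the idea of a "computable" eligibility criterion concrete, the following is a minimal sketch and not the prototype system described above. It assumes a criterion can be written as an (entity, operator, value) triple, that a lookup table links each criterion entity to a data element in the patient database, and that patient records are plain dictionaries. The names `ENTITY_TO_FIELD`, `Criterion`, `matches`, and `find_cohort`, the column names, and the three toy criteria are all hypothetical and are not taken from the 77 analyzed trials.

```python
import operator
from dataclasses import dataclass
from typing import Any, Dict, List

# Hypothetical link between criterion entities and database columns.
# In an ontology-driven system this mapping would come from the ontology;
# here it is a hard-coded stand-in.
ENTITY_TO_FIELD: Dict[str, str] = {
    "age": "age_years",
    "hcv_diagnosis": "has_hcv_dx",
    "pregnancy": "is_pregnant",
}

# Comparison operators supported by this sketch.
OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq}


@dataclass(frozen=True)
class Criterion:
    """One structured eligibility criterion (illustrative form only)."""
    entity: str            # entity named in the free-text criterion
    op: str                # key into OPS
    value: Any             # threshold or expected value
    exclude: bool = False  # True for an exclusion criterion


def matches(patient: Dict[str, Any], criterion: Criterion) -> bool:
    """Evaluate one criterion against one patient record."""
    field = ENTITY_TO_FIELD[criterion.entity]
    result = OPS[criterion.op](patient[field], criterion.value)
    # An exclusion criterion is satisfied when the condition does NOT hold.
    return not result if criterion.exclude else result


def find_cohort(patients: List[Dict[str, Any]],
                criteria: List[Criterion]) -> List[int]:
    """Return the IDs of patients who satisfy every criterion."""
    return [p["patient_id"] for p in patients
            if all(matches(p, c) for c in criteria)]


# Toy data, invented for illustration.
PATIENTS = [
    {"patient_id": 1, "age_years": 45, "has_hcv_dx": True, "is_pregnant": False},
    {"patient_id": 2, "age_years": 17, "has_hcv_dx": True, "is_pregnant": False},
    {"patient_id": 3, "age_years": 60, "has_hcv_dx": False, "is_pregnant": False},
    {"patient_id": 4, "age_years": 30, "has_hcv_dx": True, "is_pregnant": True},
]

CRITERIA = [
    Criterion("age", ">=", 18),
    Criterion("hcv_diagnosis", "==", True),
    Criterion("pregnancy", "==", True, exclude=True),
]

if __name__ == "__main__":
    print(find_cohort(PATIENTS, CRITERIA))
```

Keeping the entity-to-field mapping separate from the criteria is the point of the sketch: a researcher states criteria in terms of clinical entities, and only the mapping needs to know how the target database names its columns. Temporal patterns are not modeled here.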