Department of Biomedical Informatics & Medical Education, University of Washington, Seattle, WA 98195, United States.
Biomedical Informatics & Data Science, Department of Medicine, Johns Hopkins University, Baltimore, MD 21218, United States.
J Am Med Inform Assoc. 2024 Oct 1;31(10):2202-2209. doi: 10.1093/jamia/ocae211.
To demonstrate that 2 popular cohort discovery tools, Leaf and the Shared Health Research Information Network (SHRINE), are readily interoperable. Specifically, we adapted Leaf to function as a node in a SHRINE-based federated data network and to dynamically generate queries for heterogeneous data models.
SHRINE queries are designed to run on the Informatics for Integrating Biology & the Bedside (i2b2) data model. We created functionality in Leaf to interoperate with a SHRINE data network and dynamically translate SHRINE queries to other data models. We randomly selected 500 past queries from the SHRINE-based national Evolve to Next-Gen Accrual to Clinical Trials (ENACT) network for evaluation, and an additional 100 queries to refine and debug Leaf's translation functionality. We created a script for Leaf to convert the terms in the SHRINE queries into equivalent structured query language (SQL) concepts, which were then executed on 2 other data models.
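As a rough illustration of the term-to-SQL translation step described above, the following Python sketch maps query terms to SQL predicates and assembles a patient-count query. It is not the authors' script: the term codes, the table and column names (`patient`, `diagnosis`, `demographics`), and the function `translate_query` are all hypothetical, and the sketch assumes the usual i2b2 panel convention in which terms inside a panel are combined with OR and panels are combined with AND.

```python
from typing import Dict, List

# Hypothetical mapping from query terms to SQL predicates on a
# hypothetical non-i2b2 schema. None of these names come from the paper.
CONCEPT_TO_SQL: Dict[str, str] = {
    "DX:E11": ("EXISTS (SELECT 1 FROM diagnosis d "
               "WHERE d.patient_id = p.patient_id AND d.icd10_code LIKE 'E11%')"),
    "DX:I10": ("EXISTS (SELECT 1 FROM diagnosis d "
               "WHERE d.patient_id = p.patient_id AND d.icd10_code LIKE 'I10%')"),
    "DEM:FEMALE": ("EXISTS (SELECT 1 FROM demographics g "
                   "WHERE g.patient_id = p.patient_id AND g.sex = 'F')"),
}


def translate_query(panels: List[List[str]]) -> str:
    """Build a patient-count SQL statement from panels of terms.

    Assumption: terms within a panel are ORed, panels are ANDed.
    An unmapped term raises KeyError rather than being silently dropped.
    """
    clauses = []
    for panel in panels:
        predicates = [CONCEPT_TO_SQL[term] for term in panel]
        clauses.append("(" + " OR ".join(predicates) + ")")
    # An empty query matches every patient.
    where = " AND ".join(clauses) if clauses else "1 = 1"
    return "SELECT COUNT(DISTINCT p.patient_id) FROM patient p WHERE " + where


if __name__ == "__main__":
    # (type 2 diabetes OR hypertension) AND female
    print(translate_query([["DX:E11", "DX:I10"], ["DEM:FEMALE"]]))
```

Failing loudly on an unmapped term is a deliberate choice in this sketch, since a silently dropped term would change the cohort definition and distort the returned count.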
Of the generated queries for non-i2b2 models, 91.1% returned counts within 5% of the i2b2 count (or within ±5 patients for counts under 100), with 91.3% recall. Of the 8.9% of queries that exceeded this margin, 77 of 89 (86.5%) were due to errors introduced by the Python script or the extract-transform-load process, which are easily fixed in a production deployment. The remaining errors were due to Leaf's translation function, which was later fixed.
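The agreement criterion used above can be written as a small check. This is a minimal sketch, not the authors' evaluation code: the function name `counts_agree` and its parameters are hypothetical, and it assumes that "counts under 100" refers to the i2b2 (reference) count.

```python
def counts_agree(reference: int, candidate: int,
                 rel_tol: float = 0.05, abs_tol: int = 5,
                 small_cutoff: int = 100) -> bool:
    """Return True if a non-i2b2 count agrees with the i2b2 reference count.

    Assumption: the cutoff is applied to the reference count. Below the
    cutoff the absolute margin (±5 patients) applies; otherwise the
    relative margin (5% of the reference) applies.
    """
    diff = abs(candidate - reference)
    if reference < small_cutoff:
        return diff <= abs_tol
    return diff <= rel_tol * reference


if __name__ == "__main__":
    print(counts_agree(1000, 1040))  # 40 patients off, margin is 50
    print(counts_agree(60, 67))      # 7 patients off, margin is 5
```

The absolute margin for small counts avoids flagging queries where a difference of a few patients is a large percentage but of little practical consequence.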
Our results indicate that cohort discovery applications such as Leaf and SHRINE can interoperate in federated data networks with heterogeneous data models.