Mayo Clinic Artificial Intelligence Laboratory, 200 1st Street SW, Rochester, MN, 55902, USA.
Mayo Clinic Department of Radiology, 200 1st Street SW, Rochester, MN, 55902, USA.
J Imaging Inform Med. 2024 Jun;37(3):1239-1247. doi: 10.1007/s10278-024-00977-3. Epub 2024 Feb 16.
Curating and integrating data from multiple sources are bottlenecks to procuring robust training datasets for artificial intelligence (AI) models in healthcare. While numerous applications can process discrete types of clinical data, integrating heterogeneous data types remains time-consuming. There is therefore a need for more efficient retrieval and storage of curated patient data from dissimilar sources, such as biobanks, health records, and sensors. We describe a customizable, modular data retrieval application (RIL-workflow), which integrates clinical notes, images, and prescription data, and demonstrate its feasibility for research at our institution. It uses the workflow automation platform Camunda (Camunda Services GmbH, Berlin, Germany) to collect internal data from Fast Healthcare Interoperability Resources (FHIR) and Digital Imaging and Communications in Medicine (DICOM) sources. Through a web-based graphical user interface (GUI), the workflow runs tasks to completion according to its visual representation, retrieving and storing results for patients who meet study inclusion criteria while segregating errors for human review. We showcase RIL-workflow with its library of ready-to-use modules, which lets researchers specify human input or automation at fixed steps. We validated the workflow by demonstrating its capability to aggregate and curate data from multiple sources, and to handle the associated errors, in order to generate a multimodal database for clinical AI research. Further, we solicited user feedback to identify the pros and cons of RIL-workflow. The source code is available at github.com/magnooj/RIL-workflow.
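The retrieval pattern described in the abstract (modular tasks run in order per patient, results stored for patients meeting inclusion criteria, errors set aside for human review) can be illustrated with a minimal Python sketch. This is not the RIL-workflow code and does not use Camunda, FHIR, or DICOM APIs. The in-memory dictionaries, the module names (`fetch_notes`, `fetch_images`, `fetch_prescriptions`), and `run_workflow` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
Module = Callable[[Record], Record]

# Hypothetical in-memory stand-ins for FHIR and DICOM sources.
NOTES = {"p1": "chest pain", "p2": "follow-up", "p3": "routine exam"}
IMAGES = {"p1": ["ct_001"], "p3": ["xr_007"]}  # p2 has no imaging
PRESCRIPTIONS = {"p1": ["aspirin"], "p2": ["statin"], "p3": []}


def _lookup(source: dict, patient_id: str, label: str):
    """Return the patient's entry or raise so the workflow can flag it."""
    if patient_id not in source:
        raise LookupError(f"no {label} found for patient {patient_id}")
    return source[patient_id]


def fetch_notes(record: Record) -> Record:
    return {**record, "notes": _lookup(NOTES, record["patient_id"], "notes")}


def fetch_images(record: Record) -> Record:
    return {**record, "images": _lookup(IMAGES, record["patient_id"], "images")}


def fetch_prescriptions(record: Record) -> Record:
    found = _lookup(PRESCRIPTIONS, record["patient_id"], "prescriptions")
    return {**record, "prescriptions": found}


@dataclass
class WorkflowResult:
    completed: List[Record] = field(default_factory=list)
    # Each entry: (patient_id, failing module name, error message)
    needs_review: List[Tuple[str, str, str]] = field(default_factory=list)


def run_workflow(
    patient_ids: List[str],
    modules: List[Module],
    include: Callable[[Record], bool],
) -> WorkflowResult:
    """Run every module in order for each patient.

    A failure in any module moves that patient to the review queue
    instead of stopping the whole run.
    """
    result = WorkflowResult()
    for patient_id in patient_ids:
        record: Record = {"patient_id": patient_id}
        failed = False
        for module in modules:
            try:
                record = module(record)
            except Exception as exc:  # segregate the error for human review
                result.needs_review.append((patient_id, module.__name__, str(exc)))
                failed = True
                break
        # Store only complete records that meet the inclusion criteria.
        if not failed and include(record):
            result.completed.append(record)
    return result


if __name__ == "__main__":
    outcome = run_workflow(
        ["p1", "p2", "p3"],
        [fetch_notes, fetch_images, fetch_prescriptions],
        include=lambda r: bool(r["prescriptions"]),
    )
    print("stored:", [r["patient_id"] for r in outcome.completed])
    print("for review:", outcome.needs_review)
```

Keeping each retrieval step as an independent callable mirrors the modular design the abstract describes: steps can be reordered or swapped without touching the loop that routes failures to review.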