Wang Bin, Lai Junkai, Cao Han, Jin Feifei, Li Qiang, Tang Mingkun, Yao Chen, Zhang Ping
School of Clinical Medicine, Tsinghua University, No. 30 Shuangqing Road, Haidian District, Beijing 100084, China.
Institute of Automation, Chinese Academy of Sciences, No. 95 Zhongguancun Road, Haidian District, Beijing 100080, China.
Eur Heart J Digit Health. 2024 Sep 12;5(6):712-724. doi: 10.1093/ehjdh/ztae066. eCollection 2024 Nov.
This study aims to assess the feasibility and impact of implementing ChatGLM for real-world data (RWD) extraction in hospital settings. Its primary focus is the effectiveness of ChatGLM-driven data extraction compared with the manual processes associated with the electronic source data repository (ESDR) system.
The researchers developed the ESDR system, which integrates ChatGLM, electronic case report forms (eCRFs), and electronic health records. The LLaMA (Large Language Model Meta AI) model was also deployed as a comparator for ChatGLM's extraction accuracy on free-text forms. A single-centre retrospective cohort study served as a pilot case. Five eCRFs for 63 subjects, including free-text forms and discharge medication, were evaluated. The data comprised electronic medical and prescription records collected from 13 departments. The ChatGLM-assisted process was associated with an estimated efficiency improvement of 80.7% in eCRF data transcription time. For free-text forms, the initial manual input accuracy was 99.59%, the ChatGLM data extraction accuracy was 77.13%, and the LLaMA data extraction accuracy was 43.86%. The challenges associated with the use of ChatGLM centre on prompt design, prompt output consistency, prompt output verification, and integration with hospital information systems.
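The abstract does not state how the efficiency improvement or the extraction accuracies were calculated. As a minimal sketch only, the Python below assumes that accuracy is the share of reference fields matched exactly and that efficiency improvement is the relative reduction in transcription time. The function names, field names, and sample values are all hypothetical and are not taken from the study.

```python
def field_accuracy(extracted: dict, reference: dict) -> float:
    """Share of reference fields whose extracted value matches exactly (assumed definition)."""
    if not reference:
        raise ValueError("reference must contain at least one field")
    correct = sum(1 for key, value in reference.items() if extracted.get(key) == value)
    return correct / len(reference)


def time_saving(manual_minutes: float, assisted_minutes: float) -> float:
    """Relative reduction in transcription time versus the manual process (assumed definition)."""
    if manual_minutes <= 0:
        raise ValueError("manual_minutes must be positive")
    return (manual_minutes - assisted_minutes) / manual_minutes


# Hypothetical eCRF fields for one subject; not data from the study.
reference = {
    "diagnosis": "hypertension",
    "drug": "amlodipine",
    "dose_mg": "5",
    "frequency": "once daily",
}
extracted = {
    "diagnosis": "hypertension",
    "drug": "amlodipine",
    "dose_mg": "10",  # deliberate mismatch to illustrate an extraction error
    "frequency": "once daily",
}

print(f"accuracy: {field_accuracy(extracted, reference):.2%}")  # accuracy: 75.00%
print(f"time saving: {time_saving(10.0, 2.0):.1%}")  # time saving: 80.0%
```

Exact string matching is the strictest possible criterion. A real evaluation of free-text fields would probably need normalisation or clinician review before a value is counted as correct.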
The main contribution of this study is to validate the use of ESDR tools in addressing the interoperability and transparency challenges of using ChatGLM for RWD extraction in Chinese hospital settings.