Automated data collection tool for real-world cohort studies of chronic hepatitis B: Leveraging OCR and NLP technologies for improved efficiency.

Author information

Zhou Xiaomei, Zeng Tao, Zhang Yibo, Liao Yingying, Smith Jaime, Zhang Lin, Wang Chao, Li Qinghai, Wu Dongbo, Chong Yutian, Li Xinhua

Affiliations

Information Center, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, 510630, China.

Department of Infectious Diseases, The Third Affiliated Hospital of Sun Yat-sen University, Guangdong Key Laboratory of Liver Disease, Guangzhou, 510630, China.

Publication information

New Microbes New Infect. 2024 Aug 28;62:101469. doi: 10.1016/j.nmni.2024.101469. eCollection 2024 Dec.

Abstract

BACKGROUND

Collecting and standardizing clinical research data is a tedious task. This study aimed to develop an intelligent data collection tool, named CHB-EDC, for real-world cohort studies of chronic hepatitis B (CHB) that supports standardized and efficient data collection.

METHODS

CHB-EDC automatically processes data in various formats, including raw data in image format, using internationally recognized data standards together with optical character recognition (OCR) and natural language processing (NLP) models. It automatically populates the data into electronic case report forms (eCRFs) designed in the REDCap system and supports the integration of patient data from electronic medical record systems through commonly used web application interfaces. The tool enables intelligent extraction and aggregation of data, as well as secure and anonymous data sharing.
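The extract-then-populate flow described above can be illustrated with a minimal sketch. It is not the CHB-EDC implementation: the field names (`alt_u_l`, `ast_u_l`), the regular expressions, and the helper functions are hypothetical, and the OCR step is assumed to have already produced plain text. The request parameters (`token`, `content`, `format`, `type`, `data`) follow the general shape of a REDCap record-import API call, but should be checked against the documentation of the REDCap instance in use.

```python
import json
import re

# Hypothetical mapping from eCRF field names to patterns in OCR'd report text.
# Each pattern captures the first number that follows the label on the same line.
FIELD_PATTERNS = {
    "alt_u_l": re.compile(r"\bALT\b[^\d\n]*?(\d+(?:\.\d+)?)"),
    "ast_u_l": re.compile(r"\bAST\b[^\d\n]*?(\d+(?:\.\d+)?)"),
}


def extract_fields(ocr_text):
    """Pull labelled numeric values out of text that OCR has already produced."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        if match:
            fields[name] = match.group(1)
    return fields


def build_redcap_import(token, record_id, fields):
    """Build the form data for a record-import request (assumed parameter set)."""
    record = {"record_id": record_id, **fields}
    return {
        "token": token,
        "content": "record",
        "format": "json",
        "type": "flat",
        "data": json.dumps([record]),
    }


if __name__ == "__main__":
    text = "ALT 45.2 U/L\nAST 38 U/L"
    payload = build_redcap_import("YOUR_API_TOKEN", "P001", extract_fields(text))
    print(payload["data"])
```

The resulting dictionary would be sent as form data in an HTTP POST to the instance's API endpoint. A real pipeline would also need validation against the eCRF data dictionary and de-identification before any data is shared.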

RESULTS

For non-electronic data, manual collection achieved an average accuracy of 98.65% and took an average of 63.64 min per patient. CHB-EDC achieved an average accuracy of 98.66% and took an average of 3.57 min per patient. On the same data collection task, CHB-EDC was therefore comparable to manual collection in accuracy but significantly faster (p < 0.05). The tool significantly reduced the required collection time and lowered the cost of data collection while maintaining accuracy.
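To put the reported per-patient times in proportion, the short calculation below derives the speed-up factor and the relative time saved from the two averages above. It uses only the figures reported in the abstract; the variable names are illustrative.

```python
# Average per-patient collection times reported in the abstract (minutes).
manual_min = 63.64
tool_min = 3.57

# Ratio of manual time to tool time, and the share of time saved.
speedup = manual_min / tool_min
time_saved_pct = (1 - tool_min / manual_min) * 100

print(f"{speedup:.1f}x faster, {time_saved_pct:.1f}% less time per patient")
# prints "17.8x faster, 94.4% less time per patient"
```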

CONCLUSION

The tool significantly improved the efficiency of data collection while maintaining accuracy, enabling standardized collection of real-world data.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cbbf/11402059/0c040fb40e08/gr1.jpg
