Extracting Medical Information from Paper COVID-19 Assessment Forms.

Affiliations

Vanderbilt University School of Medicine, Nashville, Tennessee, United States.

Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, United States.

Publication information

Appl Clin Inform. 2021 Jan;12(1):170-178. doi: 10.1055/s-0041-1723024. Epub 2021 Mar 10.

DOI: 10.1055/s-0041-1723024

PMID: 33694142

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7946598/

Abstract

OBJECTIVE

This study examines the validity of optical mark recognition, a novel user interface, and crowdsourced data validation to rapidly digitize and extract data from paper COVID-19 assessment forms at a large medical center.

METHODS

An optical mark recognition/optical character recognition (OMR/OCR) system was developed to identify fields that were selected on 2,814 paper assessment forms, each with 141 fields which were used to assess potential COVID-19 infections. A novel user interface (UI) displayed mirrored forms showing the scanned assessment forms with OMR results superimposed on the left and an editable web form on the right to improve ease of data validation. Crowdsourced participants validated the results of the OMR system. Overall error rate and time taken to validate were calculated. A subset of forms was validated by multiple participants to calculate agreement between participants.
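The abstract does not say how the OMR step decides whether a field was selected. As a rough illustration only, one common approach is to measure how much of a checkbox region is dark and compare that with a threshold. The sketch below assumes a grayscale image stored as nested lists, hypothetical field names and box coordinates, and arbitrary threshold values (`dark_below`, `min_fill`); none of these come from the study.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]          # grayscale pixels: 0 = black, 255 = white
Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive


def fill_ratio(image: Image, box: Box, dark_below: int = 128) -> float:
    """Return the fraction of pixels inside `box` darker than `dark_below`."""
    top, left, bottom, right = box
    pixels = [image[r][c] for r in range(top, bottom) for c in range(left, right)]
    if not pixels:
        raise ValueError("box covers no pixels")
    dark = sum(1 for p in pixels if p < dark_below)
    return dark / len(pixels)


def read_marks(image: Image, fields: Dict[str, Box], min_fill: float = 0.3) -> Dict[str, bool]:
    """Treat a field as selected when enough of its box is dark (assumed rule)."""
    return {name: fill_ratio(image, box) >= min_fill for name, box in fields.items()}


# Toy 4x8 "scan": the left half is filled in, the right half is blank.
image = [[0] * 4 + [255] * 4 for _ in range(4)]
fields = {"fever": (0, 0, 4, 4), "cough": (0, 4, 4, 8)}  # hypothetical fields
marks = read_marks(image, fields)
print(marks)  # {'fever': True, 'cough': False}
```

A fixed threshold like this is likely to misread faint or stray marks, which is one reason a human validation step of the kind described above can be useful.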

RESULTS

The OMR/OCR tools correctly extracted data from scanned form fields with a mean accuracy of 70% and a median accuracy of 78% when the OMR/OCR results were compared with the crowd-validated results. Scanned forms were crowd-validated at a mean of 157 seconds per document and a volume of approximately 108 documents per day. A randomly selected subset of documents was reviewed by multiple participants, producing an interobserver agreement of 97% when narrative-text fields were included and 98% when only Boolean and multiple-choice fields were considered.
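The abstract reports these figures without stating the exact formulas. One plausible reading, shown here as a sketch under assumptions, is that accuracy is the fraction of fields on a form where the OMR/OCR value matches the crowd-validated value, summarized across forms by mean and median, and that interobserver agreement is the same fraction computed between two validators. The field names and values below are invented for illustration and are not study data.

```python
from statistics import mean, median
from typing import Dict, List

Form = Dict[str, object]  # field name -> extracted or validated value


def field_agreement(candidate: Form, reference: Form) -> float:
    """Fraction of reference fields whose value is identical in `candidate`.

    A field missing from `candidate` is read as None, so it only counts as a
    match if the reference value is also None.
    """
    if not reference:
        raise ValueError("reference form has no fields")
    matches = sum(1 for name, value in reference.items() if candidate.get(name) == value)
    return matches / len(reference)


def summarize_accuracy(omr_forms: List[Form], validated_forms: List[Form]) -> Dict[str, float]:
    """Mean and median per-form agreement (assumed definition of accuracy)."""
    per_form = [field_agreement(o, v) for o, v in zip(omr_forms, validated_forms)]
    return {"mean": mean(per_form), "median": median(per_form)}


# Invented toy data: three forms with four Boolean fields each.
validated = [
    {"fever": True, "cough": True, "travel": False, "contact": False},
    {"fever": False, "cough": True, "travel": False, "contact": True},
    {"fever": True, "cough": False, "travel": True, "contact": False},
]
omr = [
    {"fever": True, "cough": False, "travel": True, "contact": False},  # 2 of 4 match
    {"fever": False, "cough": True, "travel": False, "contact": True},  # 4 of 4 match
    {"fever": True, "cough": False, "travel": True, "contact": True},   # 3 of 4 match
]
summary = summarize_accuracy(omr, validated)
print(summary)  # {'mean': 0.75, 'median': 0.75}
```

Calling `field_agreement` on two validators' versions of the same form gives a simple percent agreement. That measure does not correct for chance agreement, so it should not be read as a kappa statistic.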

CONCLUSION

Due to the COVID-19 pandemic, it may be challenging for health care workers wearing personal protective equipment to interact with electronic health records. The combination of OMR/OCR technology, a novel UI, and crowdsourcing data-validation processes allowed for the efficient extraction of a large volume of paper medical documents produced during the COVID-19 pandemic.
