Development of an optical character recognition pipeline for handwritten form fields from an electronic health record.

Affiliation

Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, Wisconsin 54449, USA.

Publication information

J Am Med Inform Assoc. 2012 Jun;19(e1):e90-5. doi: 10.1136/amiajnl-2011-000182. Epub 2011 Sep 2.

Abstract

BACKGROUND

Although the penetration of electronic health records is increasing rapidly, much of the historical medical record is only available in handwritten notes and forms, which require labor-intensive, human chart abstraction for some clinical research. The few previous studies on automated extraction of data from these handwritten notes have focused on monolithic, custom-developed recognition systems or third-party systems that require proprietary forms.

METHODS

We present an optical character recognition processing pipeline, which leverages the capabilities of existing third-party optical character recognition engines, and provides the flexibility offered by a modular custom-developed system. The system was configured and run on a selected set of form fields extracted from a corpus of handwritten ophthalmology forms.
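As an illustration only, a modular pipeline of this kind might be sketched as follows. The abstract does not describe the actual interfaces, so the `Engine` type, the function names, and the agree-or-abstain rule for combining outputs are all assumptions made for this sketch; real engines such as Nuance or LEADTOOLS would be wrapped behind the hypothetical `Engine` callable.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional, Sequence

# Hypothetical engine interface (an assumption, not from the paper):
# takes the image bytes of one form field, returns text or None.
Engine = Callable[[bytes], Optional[str]]


def run_parallel(engines: Sequence[Engine], field_image: bytes) -> List[Optional[str]]:
    """Run every engine on the same field image concurrently."""
    if not engines:
        return []
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        futures = [pool.submit(engine, field_image) for engine in engines]
        return [future.result() for future in futures]


def agree_or_abstain(outputs: Sequence[Optional[str]]) -> Optional[str]:
    """Assumed combination rule: emit a value only if all engines agree."""
    if not outputs or any(output is None for output in outputs):
        return None
    cleaned = {output.strip() for output in outputs}
    return cleaned.pop() if len(cleaned) == 1 else None


def recognize_field(engines: Sequence[Engine], field_image: bytes) -> Optional[str]:
    """One pipeline configuration: parallel engines plus the agreement rule."""
    return agree_or_abstain(run_parallel(engines, field_image))
```

Because each engine sits behind the same callable signature, a different configuration is just a different list of engines or a different combination function.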

OBSERVATIONS

The processing pipeline allowed multiple configurations to be run, with the optimal configuration consisting of the Nuance and LEADTOOLS engines running in parallel with a positive predictive value of 94.6% and a sensitivity of 13.5%.
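To make the two reported metrics concrete, the sketch below computes them for a pipeline that may decline to output a value for a field. The definitions are assumptions for illustration, since the abstract does not spell them out: positive predictive value is taken as correct outputs divided by all outputs emitted, and sensitivity as correct outputs divided by all fields that have a ground-truth value. The function name and the sample data are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def ppv_and_sensitivity(
    predictions: Sequence[Optional[str]], truth: Sequence[str]
) -> Tuple[float, float]:
    """Return (PPV, sensitivity) under the assumed definitions.

    predictions[i] is the pipeline output for field i, or None if the
    pipeline abstained; truth[i] is the manually abstracted value.
    """
    if len(predictions) != len(truth):
        raise ValueError("predictions and truth must have the same length")
    emitted = [(p, t) for p, t in zip(predictions, truth) if p is not None]
    correct = sum(1 for p, t in emitted if p == t)
    ppv = correct / len(emitted) if emitted else float("nan")
    sensitivity = correct / len(truth) if truth else float("nan")
    return ppv, sensitivity


# Made-up example: four fields, two outputs emitted, one of them correct.
example_ppv, example_sensitivity = ppv_and_sensitivity(
    ["a", None, "b", None], ["a", "x", "c", "d"]
)
```

Under these definitions, a configuration that abstains on most fields but is nearly always right when it does answer would show the pattern reported here: high positive predictive value with low sensitivity.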

DISCUSSION

While limitations exist, preliminary experience from this project yielded insights on the generalizability and applicability of integrating multiple, inexpensive general-purpose third-party optical character recognition engines in a modular pipeline.
