Khosravi Bardia, Rouzrokh Pouria, Erickson Bradley J
Radiology Informatics Lab (RIL), Department of Radiology, Mayo Clinic, Rochester, Minnesota.
Orthopedic Surgery Artificial Intelligence Laboratory (OSAIL), Department of Orthopedic Surgery, Mayo Clinic, Rochester, Minnesota.
J Bone Joint Surg Am. 2022 Oct 19;104(Suppl 3):51-55. doi: 10.2106/JBJS.22.00567.
Electronic health records (EHRs) have created great opportunities to collect various information from clinical patient encounters. However, most EHR data are stored in unstructured form (e.g., clinical notes, surgical notes, and medication instructions), and researchers need the data in computable (structured) form to extract meaningful relationships among variables that can influence patient outcomes. Clinical natural language processing (NLP) is the field concerned with extracting structured data from unstructured text documents in EHRs. Clinical text has several characteristics that require specialized techniques, beyond generic NLP methods, to extract structured information from it. In this article, we define clinical NLP models, introduce different methods of information extraction from unstructured data using NLP, and describe the basic technical aspects of how deep learning-based NLP models work. We conclude by noting the challenges of working with clinical NLP models and summarizing the general steps needed to launch an NLP project.