Sarmiento Raymond Francis, Dernoncourt Franck
U.S. Centers for Disease Control and Prevention, National Institute for Occupational Safety and Health, Washington, USA
Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, USA
Retrieving information from structured data tables in a large database is usually straightforward, but structured data do not always contain everything needed to retrieve accurate information; the narratives in clinical notes may capture details that the tables lack. The large volume of clinical notes, however, means that special processing is required to access the information held in their unstructured format. In this case study, we compare two techniques, structured data extraction and natural language processing, and evaluate their utility in identifying a specific patient cohort from a large clinical database.