Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, USA.
HPB (Oxford). 2010 Dec;12(10):688-95. doi: 10.1111/j.1477-2574.2010.00235.x.
Medical natural language processing (NLP) systems have been developed to identify, extract and encode information within clinical narrative text. However, the role of NLP in clinical research and patient care remains limited. Pancreatic cysts are common. Some pancreatic cysts, such as intraductal papillary mucinous neoplasms (IPMNs), have malignant potential and require extended periods of surveillance. We sought to develop a novel NLP system that could be applied in our clinical network to build a functional registry of IPMN patients.
This study aims to validate the accuracy of our novel NLP system in the identification of surgical patients with pathologically confirmed IPMN in comparison with our pre-existing manually created surgical database (standard reference).
The Regenstrief EXtraction Tool (REX) was used to extract pancreatic cyst patient data from medical text files from Indiana University Health. The system was assessed periodically by direct sampling and review of medical records. Results were compared with the standard reference.
Natural language processing detected 5694 unique patients with pancreatic cysts, in 215 of whom surgical pathology had confirmed IPMN. The NLP software identified all but seven of the patients present in the surgical database and identified an additional 37 IPMN patients not previously included in the surgical database. Using the standard reference, the sensitivity of the NLP program was 97.5% (95% confidence interval [CI] 94.8-98.9%) and its positive predictive value was 95.5% (95% CI 92.3-97.5%).
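For readers less familiar with these metrics, the following is a minimal sketch of how sensitivity and positive predictive value, each with an approximate 95% CI, can be computed from counts of true positives, false negatives and false positives. The function names are hypothetical, the counts in the usage lines are placeholders rather than data from this study, and the Wilson score interval is an assumption: the abstract does not state which interval method the authors used.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def sensitivity_and_ppv(tp, fn, fp):
    """Return (sensitivity, PPV) from true-positive, false-negative and false-positive counts."""
    sensitivity = tp / (tp + fn)  # share of reference-standard cases the system found
    ppv = tp / (tp + fp)          # share of system-flagged cases that are true cases
    return sensitivity, ppv


# Placeholder counts for illustration only (NOT taken from the study).
tp, fn, fp = 90, 10, 10
sens, ppv = sensitivity_and_ppv(tp, fn, fp)
sens_ci = wilson_interval(tp, tp + fn)
ppv_ci = wilson_interval(tp, tp + fp)
print(f"sensitivity = {sens:.3f}, 95% CI = ({sens_ci[0]:.3f}, {sens_ci[1]:.3f})")
print(f"PPV = {ppv:.3f}, 95% CI = ({ppv_ci[0]:.3f}, {ppv_ci[1]:.3f})")
```

Sensitivity is measured against the standard reference (cases the system should have found), whereas positive predictive value is measured against the system's own output (cases it flagged), so the two proportions have different denominators.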
Natural language processing is a reliable and accurate method for identifying selected patient cohorts and may facilitate the identification and follow-up of patients with IPMN.