Mehrabi Saeed, Schmidt C Max, Waters Joshua A, Beesley Chris, Krishnan Anand, Kesterson Joe, Dexter Paul, Al-Haddad Mohammed A, Tierney William M, Palakal Mathew
School of Informatics, Indiana University, Indianapolis, IN, USA.
Stud Health Technol Inform. 2013;192:822-6.
Pancreatic cancer is one of the deadliest cancers and is mostly diagnosed at a late stage. Patients with pancreatic cysts are at higher risk of developing cancer, and surveillance of these patients can help diagnose the disease at an earlier stage. In this retrospective study, we collected a corpus of 1064 records from 44 patients at Indiana University Hospital from 1990 to 2012. A Natural Language Processing (NLP) system was developed and used to identify patients with pancreatic cysts. The NegEx algorithm was used initially to identify the negation status of concepts, yielding a precision of 98.9% and a recall of 89%. The Stanford Dependency Parser (SDP) was then used to improve NegEx performance, yielding a precision of 98.9% and a recall of 95.7%. Features related to pancreatic cysts were also extracted from patient medical records using regular expressions and the NegEx algorithm, with 98.5% precision and 97.43% recall. Adding SDP to NegEx increased the recall of feature extraction to 98.12%.
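As an illustration of the negation step described in the abstract, the following is a minimal sketch of NegEx-style pre-negation detection. It is not the authors' system: the trigger and termination lists are heavily abbreviated, the five-token window is an assumed value, the function names are hypothetical, and the dependency-parser refinement is not included.

```python
import re

# Abbreviated, illustrative lists; the full NegEx lexicon is much larger.
PRE_NEGATION = ["no evidence of", "negative for", "without", "denies", "no"]
TERMINATION = {"but", "however", "although"}
WINDOW = 5  # assumed number of tokens a trigger may precede the concept by


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def is_negated(sentence, concept):
    """Return True/False for the negation status of the first mention of
    `concept` in `sentence`, or None if the concept is not mentioned."""
    tokens = tokenize(sentence)
    target = tokenize(concept)
    n = len(target)
    starts = [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == target]
    if not starts:
        return None
    start = starts[0]
    triggers = [tokenize(t) for t in PRE_NEGATION]
    # Walk backward from the concept, at most WINDOW tokens.
    for i in range(start - 1, max(0, start - WINDOW) - 1, -1):
        # A termination term between a trigger and the concept ends the scope.
        if tokens[i] in TERMINATION:
            return False
        for trig in triggers:
            # The trigger must begin at i and end before the concept starts.
            if tokens[i:i + len(trig)] == trig and i + len(trig) <= start:
                return True
    return False


if __name__ == "__main__":
    for s in [
        "No evidence of pancreatic cyst.",
        "A 2 cm pancreatic cyst is seen in the tail.",
        "No mass, but a pancreatic cyst is present.",
    ]:
        print(s, "->", is_negated(s, "pancreatic cyst"))
```

A purely window-based rule like this one misses negations whose trigger lies outside the window or is linked to the concept only through sentence structure, which is the kind of case a dependency parse can help recover.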