Kim Brian J, Merchant Madhur, Zheng Chengyi, Thomas Anil A, Contreras Richard, Jacobsen Steven J, Chien Gary W
Department of Urology, Kaiser Permanente Los Angeles Medical Center, Los Angeles, California.
J Endourol. 2014 Dec;28(12):1474-8. doi: 10.1089/end.2014.0221.
Natural language processing (NLP) software programs have been widely developed to transform complex free text into simplified, organized data. Potential applications in the field of medicine include automated report summaries, physician alerts, patient repositories, electronic medical record (EMR) billing, and quality metric reports. Despite these prospects and the recent widespread adoption of EMRs, NLP has been relatively underutilized. The objective of this study was to evaluate the performance of an internally developed NLP program in extracting select pathologic findings from radical prostatectomy specimen reports in the EMR.
A software engineer developed an NLP program to extract key variables from prostatectomy reports in the EMR within our healthcare system. The variables were TNM stage, Gleason grade, presence of a tertiary Gleason pattern, histologic subtype, size of the dominant tumor nodule, seminal vesicle invasion (SVI), perineural invasion (PNI), angiolymphatic invasion (ALI), extracapsular extension (ECE), and surgical margin status (SMS). The program was validated by comparing its results with a gold standard compiled by two blinded manual reviewers for 100 randomly selected pathology reports.
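The abstract does not describe how the program works internally, so the following is only a minimal rule-based sketch of this kind of extraction, not the authors' method. The regular expression, the phrase table `FINDINGS`, the negation list `NEGATIONS`, the function `extract`, and the sample report text are all hypothetical and cover only a few of the variables listed above.

```python
import re

# Hypothetical pattern: matches e.g. "Gleason score: 3 + 4" or "Gleason 4+3".
GLEASON_RE = re.compile(
    r"gleason\s+(?:score|grade)?\s*:?\s*(\d)\s*\+\s*(\d)", re.IGNORECASE
)

# Hypothetical phrase table for a few of the binary findings.
FINDINGS = {
    "SVI": "seminal vesicle invasion",
    "PNI": "perineural invasion",
    "ECE": "extracapsular extension",
}

# Hypothetical negation cues; real reports use far more varied wording.
NEGATIONS = ("not identified", "absent", "negative", "not present")


def extract(report: str) -> dict:
    """Pull a Gleason pattern pair and binary findings out of free text."""
    result = {}
    match = GLEASON_RE.search(report)
    result["gleason"] = (
        (int(match.group(1)), int(match.group(2))) if match else None
    )
    for key, phrase in FINDINGS.items():
        result[key] = None  # None means the finding was never mentioned
        for line in report.lower().splitlines():
            if phrase in line:
                # Present unless the same line carries a negation cue.
                result[key] = not any(neg in line for neg in NEGATIONS)
                break
    return result


# Invented sample text, not taken from any real pathology report.
sample = """Gleason score: 3 + 4 = 7
Seminal vesicle invasion: Not identified
Perineural invasion: Present
Extracapsular extension: Absent"""

print(extract(sample))
```

A line-by-line keyword match with negation cues is the simplest possible design; it would miss findings described across several lines or negated in a neighboring sentence, which is one reason validation against manual review matters.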
NLP demonstrated 100% accuracy for identifying the Gleason grade, presence of a tertiary Gleason pattern, SVI, ALI, and ECE. It also demonstrated near-perfect accuracy for extracting histologic subtype (99.0%), PNI (98.9%), TNM stage (98.0%), SMS (97.0%), and dominant tumor size (95.7%). The overall accuracy of NLP was 98.7%. NLP generated a result in <1 second, whereas the manual reviewers averaged 3.2 minutes per report.
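Per-variable and overall accuracy figures of this kind can be computed by comparing each extracted value with the gold standard. The sketch below assumes that overall accuracy is pooled over all compared data points; the abstract does not state whether the 98.7% figure was pooled or averaged across variables. The function name `accuracy_by_variable` and the toy records are hypothetical and are not study data.

```python
def accuracy_by_variable(predicted, gold):
    """Return ({variable: fraction correct}, pooled fraction correct).

    `predicted` and `gold` are equal-length lists of dicts, one per report,
    with the same keys in every dict.
    """
    per_variable = {}
    total_hits = 0
    total_compared = 0
    for variable in gold[0]:
        hits = sum(p[variable] == g[variable] for p, g in zip(predicted, gold))
        per_variable[variable] = hits / len(gold)
        total_hits += hits
        total_compared += len(gold)
    return per_variable, total_hits / total_compared


# Toy data: four invented reports, two variables, one PNI disagreement.
gold = [
    {"SVI": False, "PNI": True},
    {"SVI": True, "PNI": True},
    {"SVI": False, "PNI": False},
    {"SVI": False, "PNI": True},
]
predicted = [
    {"SVI": False, "PNI": True},
    {"SVI": True, "PNI": True},
    {"SVI": False, "PNI": True},  # mismatch with the gold standard
    {"SVI": False, "PNI": True},
]

per_variable, overall = accuracy_by_variable(predicted, gold)
print(per_variable)  # {'SVI': 1.0, 'PNI': 0.75}
print(overall)       # 0.875
```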
This novel program demonstrated high accuracy and efficiency in identifying key pathologic details from prostatectomy reports within an EMR system. NLP has the potential to assist urologists by summarizing and highlighting relevant information from verbose pathology reports. It may also facilitate future urologic research through the rapid, automated creation of large databases.