Swartz Jordan, Koziatek Christian, Theobald Jason, Smith Silas, Iturrate Eduardo
New York University School of Medicine, Ronald O. Perelman Department of Emergency Medicine, New York, NY, United States.
Int J Med Inform. 2017 May;101:93-99. doi: 10.1016/j.ijmedinf.2017.02.011. Epub 2017 Feb 21.
Testing for venous thromboembolism (VTE) is associated with cost and risk to patients (e.g., radiation exposure). To assess the appropriateness of imaging utilization at the provider level, it is important to know each provider's diagnostic yield (the percentage of tests that are positive for the diagnostic entity of interest). However, determining diagnostic yield typically requires either time-consuming manual review of radiology reports or the use of complex and/or proprietary natural language processing software.
The objectives of this study were twofold: 1) to develop and implement a simple, user-configurable, open-source natural language processing (NLP) tool that classifies radiology reports with high accuracy, and 2) to use the results of the tool to design a provider-specific VTE imaging dashboard showing both utilization rate and diagnostic yield.
Two physicians reviewed a training set of 400 lower extremity ultrasound (UTZ) and computed tomography pulmonary angiogram (CTPA) reports to understand the language used in VTE-positive and VTE-negative reports. The insights from this review informed the values chosen for the five modifiable parameters of the NLP tool. A validation set of 2,000 studies was then classified independently by the reviewers and by the tool; the two sets of classifications were compared, and the performance of the tool was calculated.
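The comparison of tool output against reviewer classifications can be illustrated with a minimal Python sketch. This is not the authors' code: the function names are hypothetical, the labels in the usage example are made up for illustration, and the Wald (normal approximation) interval is an assumption, since the abstract does not state which confidence interval method was used.

```python
import math


def confusion_counts(reference, predicted):
    """Count (tp, fp, tn, fn) from paired boolean labels.

    reference: reviewer labels (True = VTE-positive report)
    predicted: tool labels for the same reports, in the same order
    """
    tp = fp = tn = fn = 0
    for ref, pred in zip(reference, predicted):
        if ref and pred:
            tp += 1
        elif ref and not pred:
            fn += 1
        elif pred:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def proportion_with_ci(successes, total, z=1.96):
    """Return (proportion, lower, upper) using a Wald interval.

    Assumption: normal approximation, clipped to [0, 1]. The interval
    collapses to zero width when the proportion is exactly 0 or 1.
    """
    if total == 0:
        raise ValueError("total must be positive")
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def sensitivity_specificity(reference, predicted):
    """Sensitivity and specificity of the tool against the reviewers."""
    tp, fp, tn, fn = confusion_counts(reference, predicted)
    return {
        "sensitivity": proportion_with_ci(tp, tp + fn),
        "specificity": proportion_with_ci(tn, tn + fp),
    }


if __name__ == "__main__":
    # Hypothetical labels, for illustration only (not study data).
    reviewers = [True, True, True, False, False, False, False, True]
    tool = [True, True, False, False, False, False, True, True]
    for name, (p, lo, hi) in sensitivity_specificity(reviewers, tool).items():
        print(f"{name}: {100 * p:.1f}% (95% CI {100 * lo:.1f}-{100 * hi:.1f})")
```

For small samples or proportions near 0 or 1, a Wilson or exact interval would usually be a better choice than the Wald interval used in this sketch.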
The tool was highly accurate in classifying the presence and absence of VTE for both UTZ reports (sensitivity 95.7%, 95% CI 91.5-99.8; specificity 100%, 95% CI 100-100) and CTPA reports (sensitivity 97.1%, 95% CI 94.3-99.9; specificity 98.6%, 95% CI 97.8-99.4). The diagnostic yield was then calculated at the individual provider level and the imaging dashboard was created.
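As a rough illustration of the provider-level calculation, the Python sketch below aggregates classified studies into a per-provider diagnostic yield (percentage of ordered studies classified as positive). The data layout, the function name, and the definition of utilization rate as studies ordered per 100 patient encounters are all assumptions made for this example; the abstract does not specify how the dashboard computes utilization.

```python
from collections import defaultdict


def provider_dashboard(studies, encounters_by_provider):
    """Summarize imaging yield and utilization per ordering provider.

    studies: iterable of (provider_id, is_positive) pairs, one per study,
        where is_positive is the classification of the radiology report
    encounters_by_provider: dict mapping provider_id to a count of patient
        encounters (assumed denominator for the utilization rate)
    """
    ordered = defaultdict(int)
    positive = defaultdict(int)
    for provider, is_positive in studies:
        ordered[provider] += 1
        if is_positive:
            positive[provider] += 1

    rows = {}
    for provider, n_studies in ordered.items():
        encounters = encounters_by_provider.get(provider)
        rows[provider] = {
            "studies": n_studies,
            "positive": positive[provider],
            # Diagnostic yield: percentage of studies that were positive.
            "yield_pct": 100.0 * positive[provider] / n_studies,
            # Utilization: studies per 100 encounters (None if unknown).
            "utilization_pct": (
                100.0 * n_studies / encounters if encounters else None
            ),
        }
    return rows


if __name__ == "__main__":
    # Hypothetical providers and counts, for illustration only.
    example_studies = [
        ("provider_a", True),
        ("provider_a", False),
        ("provider_a", False),
        ("provider_a", False),
        ("provider_b", True),
        ("provider_b", True),
    ]
    example_encounters = {"provider_a": 40}
    for provider, row in provider_dashboard(
        example_studies, example_encounters
    ).items():
        print(provider, row)
```

Reporting yield alongside utilization matters because a provider who orders very few studies can show a high yield from a handful of positives, so the two figures are best read together.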
We have created a novel NLP tool designed for users without a background in computer programming, which has been used to classify venous thromboembolism reports with a high degree of accuracy. The tool is open-source and available for download at http://iturrate.com/simpleNLP. Results obtained using this tool can be applied to enhance quality by presenting information about utilization and yield to providers via an imaging dashboard.