Division of Research and Systems Research Initiative, Kaiser Permanente, 2000 Broadway, Webster Annex, Oakland, CA 94612, USA.
BMC Med Inform Decis Mak. 2013 Aug 15;13:90. doi: 10.1186/1472-6947-13-90.
Prior studies demonstrate the suitability of natural language processing (NLP) for identifying pneumonia in chest radiograph (CXR) reports; however, few evaluate this approach in intensive care unit (ICU) patients.
From a total of 194,615 ICU reports, we empirically developed a lexicon to categorize pneumonia-relevant terms and uncertainty profiles. We encoded lexicon items into unique queries within an NLP software application and designed an algorithm to assign automated interpretations ('positive', 'possible', or 'negative') based on each report's query profile. We evaluated algorithm performance in a sample of 2,466 CXR reports interpreted by physician consensus and in two ICU patient subgroups: those admitted for pneumonia and those admitted for rheumatologic/endocrine diagnoses.
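The lexicon-to-interpretation pipeline described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the term lists, the cue patterns, the sentence-level matching, and the three-way decision rule are hypothetical stand-ins, not the study's 105-item lexicon, its 31 queries, or its twenty-rule algorithm.

```python
import re

# Hypothetical mini-lexicon (illustrative only, not the study's lexicon).
PNEUMONIA_TERMS = re.compile(
    r"\b(?:pneumonia|consolidation|infiltrate)s?\b", re.IGNORECASE
)
# Hypothetical uncertainty cues.
UNCERTAINTY_CUES = re.compile(
    r"\b(?:possible|possibly|may represent|cannot be excluded|versus|suggestive of)\b",
    re.IGNORECASE,
)
# Hypothetical negation cues.
NEGATION_CUES = re.compile(r"\b(?:no|without|negative for)\b", re.IGNORECASE)


def query_profile(report: str) -> dict:
    """Count sentence-level hits by category for one report."""
    profile = {"positive": 0, "possible": 0, "negative": 0}
    # Naive sentence split on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", report.strip()):
        if not PNEUMONIA_TERMS.search(sentence):
            continue  # sentence mentions no pneumonia-relevant term
        if NEGATION_CUES.search(sentence):
            profile["negative"] += 1
        elif UNCERTAINTY_CUES.search(sentence):
            profile["possible"] += 1
        else:
            profile["positive"] += 1
    return profile


def interpret(report: str) -> str:
    """Assign 'positive', 'possible', or 'negative' from the hit profile.

    Assumed rule: any unhedged mention wins, then any hedged mention,
    otherwise the report is treated as negative.
    """
    profile = query_profile(report)
    if profile["positive"] > 0:
        return "positive"
    if profile["possible"] > 0:
        return "possible"
    return "negative"


if __name__ == "__main__":
    print(interpret("Right lower lobe infiltrate, possibly pneumonia."))
```

A real system would need far richer handling of negation scope, temporal language ('resolved', 'unchanged'), and conflicting statements within one report, which is presumably where the additional rules and probability steps come in.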
Most reports were deemed 'negative' (51.8%) by physician consensus. Many were 'possible' (41.7%); only 6.5% were 'positive' for pneumonia. The lexicon included 105 terms and uncertainty profiles that were encoded into 31 NLP queries. Queries identified 534,322 'hits' in the full sample, with 2.7 ± 2.6 'hits' per report. An algorithm, composed of twenty rules and probability steps, assigned interpretations to reports based on query profiles. In the validation set, the algorithm had 92.7% sensitivity, 91.1% specificity, 93.3% positive predictive value, and 90.3% negative predictive value for differentiating 'negative' from 'positive'/'possible' reports. In the ICU subgroups, the algorithm also demonstrated good performance, misclassifying few reports (5.8%).
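For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity, and the predictive values follow from a two-by-two table once 'positive' and 'possible' are collapsed into a single class, as in the comparison above. The function names and the example labels are hypothetical; the study's underlying counts are not given in the abstract and are not reconstructed here.

```python
def confusion_counts(reference, predicted):
    """Tally TP, FP, TN, FN after collapsing 'positive'/'possible' into one class.

    `reference` holds the physician-consensus labels and `predicted` the
    algorithm labels; each is a sequence of 'positive', 'possible', or 'negative'.
    """
    tp = fp = tn = fn = 0
    for ref, pred in zip(reference, predicted):
        ref_flag = ref != "negative"    # True for 'positive' or 'possible'
        pred_flag = pred != "negative"
        if ref_flag and pred_flag:
            tp += 1
        elif not ref_flag and pred_flag:
            fp += 1
        elif not ref_flag and not pred_flag:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def binary_metrics(tp, fp, tn, fn):
    """Standard diagnostic accuracy metrics from confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among reference positives
        "specificity": tn / (tn + fp),  # true negatives among reference negatives
        "ppv": tp / (tp + fp),          # reference positives among predicted positives
        "npv": tn / (tn + fn),          # reference negatives among predicted negatives
    }


if __name__ == "__main__":
    # Made-up labels for illustration only.
    reference = ["positive", "possible", "negative", "negative", "possible"]
    predicted = ["positive", "negative", "negative", "possible", "possible"]
    print(binary_metrics(*confusion_counts(reference, predicted)))
```

Note that the predictive values depend on how common 'positive'/'possible' reports are in the evaluated sample, so they would shift in a population with a different mix of reports even if sensitivity and specificity stayed the same.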
Many CXR reports in ICU patients demonstrate frank uncertainty regarding a pneumonia diagnosis. This electronic tool demonstrates promise for assigning automated interpretations to CXR reports by leveraging both terms and uncertainty profiles.