D'Or Institute for Research and Education (IDOR), Rio de Janeiro, Brazil.
Radiomics and Augmented Intelligence Laboratory (RAIL), University of Florida, Gainesville, FL.
JCO Glob Oncol. 2023 Sep;9:e2300191. doi: 10.1200/GO.23.00191.
To evaluate the diagnostic performance of a natural language processing (NLP) model in detecting incidental lung nodules (ILNs) in unstructured chest computed tomography (CT) reports.
All consecutive unstructured chest CT reports from a tertiary hospital between 2020 and 2021 (n = 21,542) were retrospectively reviewed to train the NLP tool. Internal validation was performed on a separate cohort of 300 chest CT scans, using reference readings of both the scans and the reports by two radiologists. External validation was then performed on a cohort of random unstructured chest CT reports from 57 different hospitals, issued in May 2022, with a review by the same thoracic radiologists as the gold standard. Sensitivity, specificity, and accuracy were calculated.
Of 21,542 CT reports, 484 mentioned at least one ILN (mean age, 71 ± 17.6 [standard deviation] years; women, 52%) and were included in the training set. In the internal validation (n = 300), the NLP tool detected ILNs with a sensitivity of 100.0% (95% CI, 97.6 to 100.0), a specificity of 95.9% (95% CI, 91.3 to 98.5), and an accuracy of 98.0% (95% CI, 95.7 to 99.3). In the external validation (n = 977), the NLP tool yielded a sensitivity of 98.4% (95% CI, 94.5 to 99.8), a specificity of 98.6% (95% CI, 97.5 to 99.3), and an accuracy of 98.6% (95% CI, 97.6 to 99.2). Twelve months after the initial reports, 8 (8.60%) patients had a final diagnosis of lung cancer, of whom 2 (2.15%) would have been lost to follow-up without the NLP tool.
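For readers who want to see how sensitivity, specificity, and accuracy with 95% confidence intervals follow from a confusion matrix, a minimal Python sketch is given here. It rests on several assumptions that are not in the abstract: the counts (152 true positives, 0 false negatives, 142 true negatives, 6 false positives) are hypothetical values chosen only so that the point estimates match the internal-validation percentages, the function names are illustrative, and the Wilson score interval is used for simplicity because the abstract does not state which interval method the authors applied. The resulting interval bounds may therefore differ slightly from the reported ones.

```python
from math import sqrt


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%).

    Assumption: the abstract does not name its CI method; Wilson is used
    here only as a simple, closed-form illustration.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    # Clamp to [0, 1] to absorb floating-point overshoot at the boundaries.
    return max(0.0, centre - half), min(1.0, centre + half)


def diagnostic_metrics(tp, fp, tn, fn):
    """Return point estimates and Wilson intervals from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "sensitivity": (tp / (tp + fn), wilson_ci(tp, tp + fn)),
        "specificity": (tn / (tn + fp), wilson_ci(tn, tn + fp)),
        "accuracy": ((tp + tn) / total, wilson_ci(tp + tn, total)),
    }


# Hypothetical counts (NOT reported in the abstract), picked so that the
# point estimates agree with the internal validation (n = 300).
metrics = diagnostic_metrics(tp=152, fp=6, tn=142, fn=0)

for name, (estimate, (low, high)) in metrics.items():
    print(f"{name}: {estimate:.1%} (95% CI, {low:.1%} to {high:.1%})")
```

Sensitivity is the fraction of reference-positive reports that the tool flags, specificity is the fraction of reference-negative reports that it leaves unflagged, and accuracy is the fraction of all reports classified correctly. An exact (Clopper-Pearson) interval is a common alternative and tends to be somewhat wider than the Wilson interval for proportions near 100%.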
NLP can be used to identify ILNs in unstructured reports with high accuracy, allowing a timely recall of patients and a potential diagnosis of early-stage lung cancer that might have been lost to follow-up.