Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA.
Chest. 2021 Nov;160(5):1902-1914. doi: 10.1016/j.chest.2021.05.048. Epub 2021 Jun 4.
There is an urgent need for population-based studies on managing patients with pulmonary nodules.
Is it possible to identify pulmonary nodules and associated characteristics using an automated method?
We revised and refined an existing natural language processing (NLP) algorithm to identify radiology transcripts with pulmonary nodules and greatly expanded its functionality to identify the characteristics of the largest nodule, when present, including size, lobe, laterality, attenuation, calcification, and edge. We compared NLP results with a reference standard of manual transcript review in a random test sample of 200 radiology transcripts. We applied the final automated method to a larger cohort of patients who underwent chest CT scan in an integrated health care system from 2006 to 2016, and described their demographic and clinical characteristics.
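The abstract does not describe how the algorithm works internally. Purely as an illustration of the kind of rule such a system might include, the sketch below pulls the largest stated nodule size out of free-text report sentences with a regular expression. The function name `largest_nodule_mm`, the pattern `SIZE`, and the sample report are all hypothetical. This is not the authors' method: it handles neither negation nor multi-dimension measurements such as "6 x 4 mm".

```python
import re

# Hypothetical pattern: a number followed by a unit of mm or cm.
SIZE = re.compile(r"(\d+(?:\.\d+)?)\s*(mm|cm)\b", re.IGNORECASE)


def largest_nodule_mm(report):
    """Return the largest size (in mm) stated in a sentence mentioning a nodule.

    Returns None when no such sentence carries a size.
    """
    sizes = []
    # Split on sentence-ending punctuation followed by whitespace, so that
    # decimals such as "1.2 cm" stay intact.
    for sentence in re.split(r"(?<=[.;])\s+", report):
        if "nodule" not in sentence.lower():
            continue
        for value, unit in SIZE.findall(sentence):
            factor = 10.0 if unit.lower() == "cm" else 1.0
            sizes.append(float(value) * factor)
    return max(sizes) if sizes else None


# Made-up report text, for illustration only.
sample_report = (
    "There is a 6 mm nodule in the right upper lobe. "
    "A 1.2 cm spiculated nodule is seen in the left lower lobe. "
    "Heart size is normal."
)
largest = largest_nodule_mm(sample_report)
```

A production system would also need to map each size to its lobe, laterality, attenuation, calcification, and edge descriptors, which is where most of the difficulty lies.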
In the test sample, the NLP algorithm had very high sensitivity (98.6%; 95% CI, 95.0%-99.8%) and specificity (100%; 95% CI, 93.9%-100%) for identifying pulmonary nodules. For attenuation, edge, and calcification, the NLP algorithm achieved similar accuracies, and it correctly identified the diameter of the largest nodule in 135 of 141 cases (95.7%; 95% CI, 91.0%-98.4%). In the larger cohort, the NLP algorithm found 217,771 reports with nodules among 717,304 chest CT reports (30.4%). From 2006 to 2016, the number of reports with nodules increased by 150%, and the mean size of the largest nodule gradually decreased from 11 mm to 8.9 mm. Radiologists documented the laterality and lobe (90%-95%) more often than the attenuation, calcification, and edge characteristics (11%-14%).
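The reported intervals can be approximately reproduced with exact (Clopper-Pearson) binomial confidence intervals. The sketch below rests on two assumptions. First, the abstract does not say which interval method was used. Second, apart from 135 of 141, the counts are inferred from the reported percentages and the 200-transcript sample rather than stated in the source: 139 of 141 nodule-positive transcripts detected, and 59 of 59 nodule-negative transcripts correctly classified.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided CI for a binomial proportion k/n, solved by bisection."""

    def solve(f, target):
        # f decreases monotonically in p on [0, 1]; find p with f(p) = target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


# Counts inferred from the abstract (assumptions, see the text above).
sens_ci = clopper_pearson(139, 141)  # sensitivity, reported as 98.6%
spec_ci = clopper_pearson(59, 59)    # specificity, reported as 100%
diam_ci = clopper_pearson(135, 141)  # diameter accuracy, reported as 95.7%

for label, (lo, hi) in [("sensitivity", sens_ci),
                        ("specificity", spec_ci),
                        ("diameter", diam_ci)]:
    print(f"{label}: {100 * lo:.1f}%-{100 * hi:.1f}%")
```

Under these assumptions, the sensitivity and specificity intervals should land close to the published 95.0%-99.8% and 93.9%-100%, which suggests, but does not confirm, that an exact method was used.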
The NLP algorithm identified pulmonary nodules and associated characteristics with high accuracy. In our community practice settings, the documentation of nodule characteristics is incomplete. Our results call for better documentation of nodule findings. The NLP algorithm can be used in population-based studies to identify pulmonary nodules, avoiding labor-intensive chart review.