Ostrovsky Adam M
Sidney Kimmel Medical College at Thomas Jefferson University, Philadelphia, PA, USA.
Am J Emerg Med. 2025 Jul;93:99-102. doi: 10.1016/j.ajem.2025.03.060. Epub 2025 Mar 27.
The rapid advancement of artificial intelligence (AI) has considerable potential to affect healthcare. Chest X-rays are essential for diagnosing acute thoracic conditions in the emergency department (ED), but interpretation delays caused by limited radiologist availability can affect clinical decision-making. AI models, including deep learning algorithms, have been explored for diagnostic support, but the potential of large language models (LLMs) in emergency radiology remains largely unexamined.
This study assessed ChatGPT's feasibility in interpreting chest X-rays for acute thoracic conditions commonly encountered in the ED. A subset of 1400 images from the NIH Chest X-ray dataset was analyzed, representing seven pathology categories: Atelectasis, Effusion, Emphysema, Pneumothorax, Pneumonia, Mass, and No Finding. ChatGPT 4.0, utilizing the "X-Ray Interpreter" add-on, was evaluated for its diagnostic performance across these categories.
ChatGPT demonstrated high performance in identifying normal chest X-rays, with a sensitivity of 98.9%, specificity of 93.9%, and accuracy of 94.7%. However, the model's performance varied across pathologies. The best results were observed in diagnosing pneumonia (sensitivity 76.2%, specificity 93.7%) and pneumothorax (sensitivity 77.4%, specificity 89.1%), while performance for atelectasis and emphysema was lower.
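For readers unfamiliar with how per-category figures of this kind are typically derived, the following is a minimal sketch of a one-vs-rest calculation of sensitivity, specificity, and accuracy. It assumes each image has a single reference label and a single predicted label; the abstract does not state how the study computed its metrics, so this is illustrative only. The function name `one_vs_rest_metrics` and the toy labels are hypothetical and are not data from the study.

```python
from collections import namedtuple

Metrics = namedtuple("Metrics", ["sensitivity", "specificity", "accuracy"])


def one_vs_rest_metrics(y_true, y_pred, positive_label):
    """Treat one category as 'positive' and all others as 'negative'."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive_label and pred == positive_label:
            tp += 1  # category present and reported
        elif truth == positive_label:
            fn += 1  # category present but missed
        elif pred == positive_label:
            fp += 1  # category reported but absent
        else:
            tn += 1  # category absent and not reported
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / total if total else float("nan")
    return Metrics(sensitivity, specificity, accuracy)


# Toy labels for illustration only (NOT results from the study).
toy_true = ["No Finding", "No Finding", "Pneumonia",
            "Pneumothorax", "Pneumonia", "No Finding"]
toy_pred = ["No Finding", "No Finding", "Pneumonia",
            "No Finding", "Pneumothorax", "No Finding"]

for label in ("No Finding", "Pneumonia"):
    print(label, one_vs_rest_metrics(toy_true, toy_pred, label))
```

Under this framing, a high sensitivity for "No Finding" means that few truly normal images were labelled abnormal, while its specificity reflects how often abnormal images were correctly not labelled normal.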
ChatGPT demonstrates potential as a supplementary tool for differentiating normal from abnormal chest X-rays, with promising results for certain pathologies like pneumonia. However, its diagnostic accuracy for more subtle conditions requires improvement. Further research integrating ChatGPT with specialized image recognition models could enhance its performance, offering new possibilities in medical imaging and education.