Fabijan Artur, Zawadzka-Fabijan Agnieszka, Fabijan Robert, Zakrzewski Krzysztof, Nowosławska Emilia, Polis Bartosz
Department of Neurosurgery, Polish-Mother's Memorial Hospital Research Institute, 93-338 Lodz, Poland.
Department of Rehabilitation Medicine, Faculty of Health Sciences, Medical University of Lodz, 90-419 Lodz, Poland.
Diagnostics (Basel). 2024 Apr 5;14(7):773. doi: 10.3390/diagnostics14070773.
Open-source artificial intelligence models (OSAIM) are freely applied across various industries, including information technology and medicine. Their clinical potential, especially in supporting diagnosis and therapy, is the subject of increasingly intensive research. Given the growing interest in artificial intelligence (AI) for diagnostic purposes, we conducted a study evaluating the capabilities of AI models, including ChatGPT and Microsoft Bing, in diagnosing single-curve scoliosis from posturographic radiological images. Two independent neurosurgeons assessed the degree of spinal deformation and selected 23 cases of severe single-curve scoliosis. Each posturographic image was submitted separately to each of these platforms together with a set of formulated questions, starting with 'What do you see in the image?' and ending with a request to determine the Cobb angle. In the responses, we focused on how these AI models identify and interpret spinal deformations and how accurately they recognize the direction and type of scoliosis as well as vertebral rotation. The Intraclass Correlation Coefficient (ICC) with a 'two-way' model was used to assess the consistency of Cobb angle measurements, and its confidence intervals were determined using the F test. Differences in Cobb angle measurements between human assessments and the ChatGPT model were analyzed using metrics such as RMSEA, MSE, MPE, MAE, RMSLE, and MAPE, allowing AI model performance to be assessed from several statistical perspectives. The ChatGPT model achieved 100% effectiveness in detecting scoliosis in X-ray images, while the Bing model did not detect any scoliosis. However, ChatGPT had limited effectiveness (43.5%) in assessing Cobb angles, showing significant inaccuracy and discrepancy compared to human assessments. This model also had limited accuracy in determining the direction of spinal curvature, classifying the type of scoliosis, and detecting vertebral rotation. Overall, although ChatGPT demonstrated potential in detecting scoliosis, its ability to assess Cobb angles and other parameters was limited and inconsistent with expert assessments. These results underscore the need for comprehensive improvement of AI algorithms, including broader training on diverse X-ray images and advanced image processing techniques, before specialists can consider them an auxiliary tool in diagnosing scoliosis.
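The agreement statistics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' analysis code. The function names `error_metrics` and `icc_two_way` and all angle values are hypothetical. The ICC variant shown is ICC(2,1) (two-way random effects, absolute agreement, single rater), which is an assumption because the abstract only says 'two-way'. MPE is computed with the sign convention (predicted - reference) / reference, also an assumption. RMSEA is left out because the abstract does not define how it was computed.

```python
import math


def error_metrics(reference, predicted):
    """Error metrics between reference (human) and predicted (model) angles."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(reference)
    errors = [p - r for r, p in zip(reference, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # Percentage errors are relative to the reference, which must be non-zero.
    mpe = 100.0 * sum(e / r for e, r in zip(errors, reference)) / n
    mape = 100.0 * sum(abs(e / r) for e, r in zip(errors, reference)) / n
    # RMSLE compares log(1 + x), so all values must be greater than -1.
    rmsle = math.sqrt(
        sum((math.log1p(p) - math.log1p(r)) ** 2
            for r, p in zip(reference, predicted)) / n
    )
    return {"MSE": mse, "MAE": mae, "MPE": mpe, "MAPE": mape, "RMSLE": rmsle}


def icc_two_way(ratings):
    """ICC(2,1) from a table with one row per subject and one column per rater.

    Assumes at least two subjects, at least two raters, and some variation
    in the ratings (otherwise the denominator is zero).
    """
    n = len(ratings)       # number of subjects (images)
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Two-way ANOVA sums of squares: subjects, raters, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


if __name__ == "__main__":
    # Hypothetical Cobb angles in degrees, not data from the study.
    human = [52.0, 61.0, 48.0, 70.0]
    model = [45.0, 66.0, 30.0, 58.0]
    print(error_metrics(human, model))
    print(icc_two_way([[h, m] for h, m in zip(human, model)]))
```

Reporting several metrics together is useful because each one hides something different: MAE and MSE are in degrees (or squared degrees), MPE reveals systematic over- or underestimation through its sign, and MAPE and RMSLE express error relative to the size of the curve.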