Sessa Francesco, Guardo Elisa, Esposito Massimiliano, Chisari Mario, Di Mauro Lucio, Salerno Monica, Pomara Cristoforo
Department of Medical, Surgical and Advanced Technologies "G.F. Ingrassia", University of Catania, 95121 Catania, Italy.
Faculty of Medicine and Surgery, "Kore" University of Enna, 94100 Enna, Italy.
Diagnostics (Basel). 2025 Aug 20;15(16):2094. doi: 10.3390/diagnostics15162094.
Background/Objectives: The integration of artificial intelligence (AI) into forensic science is expanding, yet its application in firearm injury diagnostics remains underexplored. This study investigates the diagnostic capabilities of ChatGPT-4 (February 2024 update) in classifying gunshot wounds, specifically distinguishing entrance from exit wounds, and evaluates its potential, limitations, and forensic applicability.
Methods: ChatGPT-4 was tested using three datasets: (1) 36 firearm injury images from an external database, (2) 40 images of intact skin from the forensic archive of the University of Catania (negative control), and (3) 40 real-case firearm injury images from the same archive. The AI's performance was assessed before and after machine learning (ML) training, with classification accuracy evaluated through descriptive and inferential statistics.
Results: ChatGPT-4 demonstrated a statistically significant improvement in identifying entrance wounds post-ML training, with enhanced descriptive accuracy of morphological features. However, its performance in classifying exit wounds remained limited, reflecting challenges noted in forensic literature. The AI showed high accuracy (95%) in distinguishing intact skin from injuries in the negative control analysis. A lack of standardized datasets and contextual forensic information contributed to misclassification, particularly for exit wounds.
Conclusions: While ChatGPT-4 is not yet a substitute for specialized forensic deep learning models, its iterative learning capacity and descriptive improvements suggest potential as a supplementary diagnostic tool in forensic pathology. However, risks such as overconfident misclassifications and AI-generated hallucinations highlight the need for expert oversight and cautious integration in forensic workflows. Future research should prioritize dataset expansion, contextual data integration, and standardized validation protocols to enhance AI reliability in medico-legal diagnostics.
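Illustrative note: the abstract reports 95% accuracy on the 40-image negative control but no interval around that figure. The sketch below shows how much uncertainty a sample of that size carries, using a Wilson score interval. It is not the authors' analysis. The count of 38 correct out of 40 is an assumption inferred from the reported 95%, and the function name `wilson_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(correct: int, total: int, confidence: float = 0.95):
    """Wilson score interval for a proportion (hypothetical helper)."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need 0 <= correct <= total and total > 0")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = correct / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return center - half, center + half


# Assumption: 95% of 40 negative-control images corresponds to 38 correct.
low, high = wilson_interval(38, 40)
print(f"accuracy = {38 / 40:.2%}, 95% CI = [{low:.3f}, {high:.3f}]")
```

With only 40 images the interval is fairly wide, which is consistent with the abstract's call for larger datasets and standardized validation.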