Cai Xinjian, Zhan Lili, Lin Yiteng
Department of Clinical Laboratory Medicine, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital & Shenzhen Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Shenzhen, China.
Digit Health. 2024 Nov 5;10:20552076241298503. doi: 10.1177/20552076241298503. eCollection 2024 Jan-Dec.
To evaluate the accuracy and clinical utility of GPT-4O in recognizing abnormal blood cell morphology, a critical component of hematologic diagnostics.
GPT-4O's blood cell morphology recognition capabilities were assessed by comparing its performance with that of hematologists. A total of 70 images from the External Quality Assessment (EQA) program of the Chinese National Center for Clinical Laboratories, covering 2022 to 2024, were analyzed. Two experienced hematology experts evaluated GPT-4O's recognition accuracy using a Likert scale.
GPT-4O achieved an overall accuracy of 70% in blood cell morphology recognition, significantly lower than the 95.42% accuracy of hematologists (p < 0.05). For peripheral blood smears and bone marrow smears, GPT-4O's accuracy was 77.14% and 62.86%, respectively. Likert scale evaluations revealed further discrepancies, with GPT-4O scoring 288.50 out of 350, compared to higher manual scores. GPT-4O accurately recognized certain intracellular inclusions such as Howell-Jolly bodies and Auer rods, but it misidentified fragmented red blood cells as neutrophilic metamyelocytes and oval-shaped red blood cells as sickle cells. Additionally, GPT-4O had difficulty identifying intracellular granules and distinguishing cell nuclei from cytoplasm.
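The reported percentages can be cross-checked with simple arithmetic. The sketch below assumes the 70 images were split evenly into 35 peripheral blood and 35 bone marrow smears; the abstract does not state this split, but it is the one under which the reported figures are mutually consistent. The helper names `pct` and `implied_correct` are illustrative only.

```python
# Consistency check of the accuracy figures reported in the abstract.
TOTAL_IMAGES = 70             # stated in the abstract
ASSUMED_PER_SMEAR_TYPE = 35   # ASSUMPTION: even split, not stated in the abstract


def pct(numerator: float, denominator: float) -> float:
    """Return numerator/denominator as a percentage rounded to 2 decimals."""
    return round(100 * numerator / denominator, 2)


def implied_correct(reported_pct: float, total: int) -> int:
    """Recover the nearest whole number of correct answers from a percentage."""
    return round(reported_pct / 100 * total)


# Correct counts implied by the reported per-smear accuracies.
pb_correct = implied_correct(77.14, ASSUMED_PER_SMEAR_TYPE)  # peripheral blood
bm_correct = implied_correct(62.86, ASSUMED_PER_SMEAR_TYPE)  # bone marrow

# Overall accuracy recomputed from the implied counts.
overall_pct = pct(pb_correct + bm_correct, TOTAL_IMAGES)

# Likert score expressed as a share of the 350-point maximum.
likert_pct = pct(288.50, 350)

print(f"peripheral blood: {pb_correct}/{ASSUMED_PER_SMEAR_TYPE}")
print(f"bone marrow:      {bm_correct}/{ASSUMED_PER_SMEAR_TYPE}")
print(f"overall accuracy: {overall_pct}%")
print(f"Likert share:     {likert_pct}%")
```

Under that assumption, the implied counts are 27 and 22 correct identifications, which sum to 49 of 70 and reproduce the reported overall accuracy of 70%.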
GPT-4O's performance in recognizing abnormal blood cell morphology is currently inadequate compared to hematologists. Despite its potential as a supplementary tool, significant improvements in its recognition algorithms and an expanded dataset are necessary for it to be reliable for clinical use. Future research should focus on enhancing GPT-4O's diagnostic accuracy and addressing its current limitations.