Barhoumi Yassine, Fattah Abdul Hamid, Bouaynaya Nidhal, Moron Fanny, Kim Jinsuh, Fathallah-Shaykh Hassan M, Chahine Rouba A, Sotoudeh Houman
MRIMath, 3473 Birchwood Lane, Birmingham, AL 35243, USA.
Department of Electrical and Computer Science, Rowan University, Glassboro, NJ 08028, USA.
Diagnostics (Basel). 2024 May 21;14(11):1066. doi: 10.3390/diagnostics14111066.
Patients diagnosed with glioblastoma multiforme (GBM) continue to face a dire prognosis. Developing accurate and efficient contouring methods is crucial, as they can significantly advance both clinical practice and research. This study evaluates the AI models developed by MRIMath© for GBM post-contrast T1 (T1c) and fluid-attenuated inversion recovery (FLAIR) images by comparing their contours to those of three neuro-radiologists using a smart manual contouring platform. The mean overall Sørensen-Dice similarity coefficient (DSC) for the T1c AI was 95%, with a 95% confidence interval (CI) of 93% to 96%, closely aligning with the radiologists' scores. For true positive T1c images, AI segmentation achieved a mean DSC of 81%, compared with the radiologists' scores, which ranged from 80% to 86%. Sensitivity and specificity for the T1c AI were 91.6% and 97.5%, respectively. The FLAIR AI exhibited a mean DSC of 90% with a 95% CI of 87% to 92%, comparable to the radiologists' scores. It also achieved a mean DSC of 78% for true positive FLAIR slices, versus radiologists' scores of 75% to 83%, and recorded a median sensitivity and specificity of 92.1% and 96.1%, respectively. The T1c and FLAIR AI models produced mean Hausdorff distances (<5 mm), volume measurements, kappa scores, and Bland-Altman differences that align closely with those measured by radiologists. Moreover, the inter-user variability between radiologists using the smart manual contouring platform was under 5% for T1c and under 10% for FLAIR images. These results underscore the MRIMath© platform's low inter-user variability and the high accuracy of its T1c and FLAIR AI models.
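For readers unfamiliar with the overlap metrics reported above, the following is a minimal sketch of how a DSC, sensitivity, and specificity can be computed for one pair of binary masks. It is not the MRIMath© implementation. The function names, the pixel-level definition of sensitivity and specificity, and the convention of scoring two empty masks as a DSC of 1 are all assumptions made for illustration; the abstract does not state how the study handled these details.

```python
from typing import Sequence, Tuple

# A binary mask as rows of 0/1 values (hypothetical representation).
Mask = Sequence[Sequence[int]]


def confusion_counts(pred: Mask, ref: Mask) -> Tuple[int, int, int, int]:
    """Count pixel-level (TP, FP, FN, TN) for a predicted vs. reference mask."""
    tp = fp = fn = tn = 0
    for pred_row, ref_row in zip(pred, ref):
        for p, r in zip(pred_row, ref_row):
            if p and r:
                tp += 1
            elif p:
                fp += 1
            elif r:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def dice(pred: Mask, ref: Mask) -> float:
    """Sørensen-Dice coefficient: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, ref)
    denom = 2 * tp + fp + fn
    # Assumption: two empty masks count as perfect agreement.
    return 1.0 if denom == 0 else 2 * tp / denom


def sensitivity(pred: Mask, ref: Mask) -> float:
    """TP / (TP + FN); NaN when the reference has no positive pixels."""
    tp, _, fn, _ = confusion_counts(pred, ref)
    return tp / (tp + fn) if (tp + fn) else float("nan")


def specificity(pred: Mask, ref: Mask) -> float:
    """TN / (TN + FP); NaN when the reference has no negative pixels."""
    _, fp, _, tn = confusion_counts(pred, ref)
    return tn / (tn + fp) if (tn + fp) else float("nan")


if __name__ == "__main__":
    # Toy 3x4 slice: the prediction over-segments by one pixel.
    ai_mask = [[0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
    reader_mask = [[0, 1, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
    print(dice(ai_mask, reader_mask))
    print(sensitivity(ai_mask, reader_mask))
    print(specificity(ai_mask, reader_mask))
```

Under the empty-mask convention assumed here, slices where both masks are empty score 1, so a mean taken over all slices would come out higher than a mean restricted to tumor-bearing slices.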