Rozwag Clémence, Valentini Franck, Cotten Anne, Demondion Xavier, Preux Philippe, Jacques Thibaut
Université de Lille, Lille, France.
Centre hospitalier universitaire de Lille, Lille, France.
Res Diagn Interv Imaging. 2023 Apr 29;6:100029. doi: 10.1016/j.redii.2023.100029. eCollection 2023 Jun.
To develop a model using artificial intelligence (A.I.) able to detect post-traumatic injuries on pediatric elbow X-rays, and then to evaluate its performance in silico and its impact on radiologists' interpretation in clinical practice.
A total of 1956 pediatric elbow radiographs performed following trauma were retrospectively collected from 935 patients aged between 0 and 18 years. Deep convolutional neural networks were trained on these X-rays. The two best models were selected and then evaluated on an external test set of 120 patients whose X-rays had been acquired on different radiological equipment during a different time period. Eight radiologists interpreted this external test set, first without and then with the help of the A.I. models.
Two models stood out: model 1 had an accuracy of 95.8% and an AUROC of 0.983, and model 2 had an accuracy of 90.5% and an AUROC of 0.975. On the external test set, model 1 retained a good accuracy of 82.5% and an AUROC of 0.916, whereas model 2 dropped to an accuracy of 69.2% and an AUROC of 0.793. Model 1 significantly improved radiologists' sensitivity (0.82 to 0.88, P = 0.016) and accuracy (0.86 to 0.88, P = 0.047), while model 2 significantly decreased readers' specificity (0.86 to 0.83, P = 0.031).
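For readers less familiar with the metrics reported above, the following is a minimal sketch of how accuracy, sensitivity, specificity and AUROC can be computed from binary labels and model scores. It is not the authors' code: the function names, the toy data and the 0.5 decision threshold are all assumptions made for illustration, and the numbers it prints are unrelated to the study's results.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_score: Sequence[float],
                   threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity at a fixed threshold.

    The threshold of 0.5 is an illustrative assumption, not a value
    taken from the study.
    """
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        pred = 1 if score >= threshold else 0
        if label == 1 and pred == 1:
            tp += 1
        elif label == 1 and pred == 0:
            fn += 1
        elif label == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUROC as the probability that a positive case outscores a negative one.

    Ties count as half a win (Mann-Whitney formulation). This is O(n^2),
    which is fine for a small illustration.
    """
    pos = [s for label, s in zip(y_true, y_score) if label == 1]
    neg = [s for label, s in zip(y_true, y_score) if label == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = injury present, 0 = no injury.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

toy_metrics = binary_metrics(toy_labels, toy_scores)
toy_auroc = auroc(toy_labels, toy_scores)
print(toy_metrics, toy_auroc)
```

Accuracy, sensitivity and specificity depend on the chosen threshold, whereas AUROC summarizes ranking quality across all thresholds, which is one reason two models can look close on AUROC yet behave differently once a fixed operating point is used.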
End-to-end development of a deep learning model to assess post-traumatic injuries on elbow X-rays in children was feasible. The study also showed that models with similar in silico metrics can unpredictably either improve or degrade radiologists' performance in clinical settings.