Department of Neurosurgery, Georgetown University School of Medicine, Washington, District of Columbia, USA.
Department of Neurosurgery, Keck School of Medicine of University of Southern California, Los Angeles, California, USA.
Oper Neurosurg (Hagerstown). 2023 Dec 1;25(6):e330-e337. doi: 10.1227/ons.0000000000000888. Epub 2023 Sep 1.
Assessment and feedback are critical to surgical education, but direct observational feedback by experts is rarely provided because of time constraints and is typically only qualitative. Automated, video-based, quantitative feedback on surgical performance could address this gap, improving surgical training. The authors aim to demonstrate the ability of Shannon entropy (ShEn), an information theory metric that quantifies series diversity, to predict surgical performance using instrument detections generated through deep learning.
Annotated images from a publicly available video data set of surgeons managing endoscopic endonasal carotid artery lacerations in a perfused cadaveric simulator were collected. A deep learning model was implemented to detect surgical instruments across video frames. A ShEn score for the instrument sequence was calculated for each surgical trial. Logistic regression using ShEn was used to predict hemorrhage control success.
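As an illustration of the metric only, the following minimal sketch computes the standard Shannon entropy, H = -Σ p_i log2(p_i), over the instrument labels detected in a sequence of frames. The abstract does not state the logarithm base, any normalization, or the preprocessing the authors applied, so the values produced here are not comparable to the reported ShEn scores. The function name `shannon_entropy` and the instrument labels are hypothetical and chosen for the example.

```python
import math
from collections import Counter


def shannon_entropy(sequence):
    """Shannon entropy (in bits) of the empirical label distribution.

    Assumption: each element of `sequence` is one instrument label,
    for example one detection per video frame. This is the textbook
    definition, not necessarily the authors' exact scoring procedure.
    """
    if not sequence:
        return 0.0
    counts = Counter(sequence)
    total = len(sequence)
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log2(p)
    return entropy


# Hypothetical per-frame instrument detections for two trials.
varied_trial = ["suction", "grasper", "cottonoid", "muscle"] * 2
repetitive_trial = ["suction"] * 7 + ["grasper"]

print(f"varied trial:     {shannon_entropy(varied_trial):.3f} bits")
print(f"repetitive trial: {shannon_entropy(repetitive_trial):.3f} bits")
```

In this toy example the trial that cycles evenly through four instruments scores higher than the trial dominated by a single instrument. A per-trial score of this kind could then serve as the single input feature to a logistic regression, as described above.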
ShEn scores and instrument usage patterns differed between successful and unsuccessful trials (ShEn: 0.452 vs 0.370, P < .001). Unsuccessful hemorrhage control trials displayed lower entropy and less varied instrument use patterns. By contrast, successful trials demonstrated higher entropy with more diverse instrument usage and consistent progression in instrument utilization. A logistic regression model using ShEn scores (78% accuracy and 97% average precision) was at least as accurate as surgeons' attending/resident status and years of experience for predicting trial success, and its accuracy was similar to that of expert human observers.
ShEn score offers a summative signal about surgeon performance and predicted success at controlling carotid hemorrhage in a simulated cadaveric setting. Future efforts to generalize ShEn to additional surgical scenarios can further validate this metric.