University Hospital Bonn, Department of Ophthalmology, Bonn, Germany.
Microsoft Research, Bengaluru, India.
Transl Vis Sci Technol. 2024 Apr 2;13(4):20. doi: 10.1167/tvst.13.4.20.
The purpose of this study was to assess the current use and reliability of artificial intelligence (AI)-based algorithms for analyzing cataract surgery videos.
A systematic review of the literature on intraoperative analysis of cataract surgery videos with machine learning techniques was performed. Cataract diagnosis and detection algorithms were excluded. The resulting algorithms were compared and descriptively analyzed, and their metrics were summarized or reported visually. The reproducibility and reliability of the methods and results were assessed using a modified version of the Medical Image Computing and Computer-Assisted Intervention (MICCAI) checklist.
Thirty-eight of the 550 screened studies were included: 20 addressed instrument detection or tracking, 9 focused on phase discrimination, and 8 predicted skill and complications. Instrument detection achieved an area under the receiver operating characteristic curve (ROC AUC) between 0.976 and 0.998, instrument tracking a mean average precision (mAP) between 0.685 and 0.929, phase recognition an ROC AUC between 0.773 and 0.990, and complication or surgical skill recognition an ROC AUC between 0.570 and 0.970.
The studies varied widely in quality, and replication is challenging because public datasets are scarce (none exist for manual small-incision cataract surgery) and source code is seldom published. There is no standard for reported outcome metrics, and validation of the models on external datasets is rare, making comparisons difficult. The data suggest that instrument tracking and phase detection work well, but surgical skill and complication recognition remains a challenge for deep learning.
This overview of cataract surgery analysis with AI models provides translational value for improving training of the clinician by identifying successes and challenges.
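The ROC AUC values reported above are frame- or clip-level discrimination scores. As a minimal sketch of how such a score is computed, the example below uses scikit-learn's `roc_auc_score` on synthetic, invented labels and model confidences (1 = instrument visible in a frame, 0 = not visible); the numbers are illustrative only and do not come from any of the reviewed studies.

```python
# Illustrative sketch: computing a frame-level ROC AUC for binary
# instrument presence. Labels and scores are synthetic examples.
from sklearn.metrics import roc_auc_score

# Ground-truth labels: 1 = instrument visible in frame, 0 = not visible.
y_true = [1, 1, 0, 1, 0, 0, 1, 0]
# Hypothetical model confidences that an instrument is present.
y_score = [0.92, 0.85, 0.30, 0.78, 0.70, 0.12, 0.66, 0.05]

# ROC AUC is the probability that a randomly chosen positive frame
# receives a higher score than a randomly chosen negative frame.
auc = roc_auc_score(y_true, y_score)
print(f"{auc:.3f}")
```

A value of 1.0 would mean perfect separation of instrument and non-instrument frames, while 0.5 corresponds to chance, which puts the 0.976 to 0.998 range reported for instrument detection in context.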