Institut de Biomecanique Humaine Georges Charpak (IBHGC), Arts et Metiers Institute of Technology, Paris, France.
Department of Dento-Facial Orthopedics, Faculty of Dental Surgery, Strasbourg University, Strasbourg, France.
Int J Oral Maxillofac Surg. 2020 Oct;49(10):1367-1378. doi: 10.1016/j.ijom.2020.02.015. Epub 2020 Mar 10.
The aim of this systematic review was to assess the accuracy and reliability of automatic landmarking for cephalometric analysis of three-dimensional craniofacial images. We searched MEDLINE, Embase and Web of Science until March 2019 for studies that reported results of automatic landmarking and/or measurements of human head computed tomography or cone beam computed tomography scans. Two authors independently screened articles for eligibility. Risk of bias and applicability concerns for each included study were assessed using the QUADAS-2 tool. Eleven studies with test dataset sample sizes ranging from 18 to 77 images were included. They used knowledge-, atlas- or learning-based algorithms to landmark two to 33 points of cephalometric interest. Ten studies measured mean localization errors between manually and automatically detected landmarks. Depending on the study and the landmark, mean errors ranged from <0.50 mm to >5 mm. The two best-performing algorithms used a deep learning method and reported mean errors <2 mm for every landmark, approximating the operator variability seen in manual landmarking. Risks of bias regarding patient selection and implementation of the reference standard were found; therefore, the studies might have yielded overoptimistic results. The robustness of these algorithms needs to be more thoroughly tested in challenging clinical settings. PROSPERO registration number: CRD42019119637.
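The mean localization error reported by the included studies can be illustrated with a minimal sketch. It assumes the error is the Euclidean distance in millimetres between the manually and automatically placed position of a landmark, averaged over the test images. Individual studies may define it differently. The function name, the data layout and the coordinates are hypothetical and are not taken from any included study.

```python
import math
from statistics import mean


def mean_localization_errors(manual, automatic):
    """Return the mean Euclidean error (mm) per landmark.

    Both arguments map a landmark name to a list of (x, y, z)
    coordinates in mm, one entry per image, in the same image order.
    This layout is an assumption made for illustration.
    """
    errors = {}
    for name, manual_points in manual.items():
        auto_points = automatic[name]
        if len(manual_points) != len(auto_points):
            raise ValueError(f"image count mismatch for landmark {name!r}")
        # Distance between the two placements on each image, then the mean.
        errors[name] = mean(
            math.dist(m, a) for m, a in zip(manual_points, auto_points)
        )
    return errors


# Hypothetical coordinates for two landmarks on two images.
manual = {
    "nasion": [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)],
    "menton": [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
}
automatic = {
    "nasion": [(3.0, 4.0, 0.0), (10.0, 0.0, 1.0)],
    "menton": [(0.0, 0.0, 1.0), (1.0, 1.0, 2.0)],
}

errors = mean_localization_errors(manual, automatic)
# Landmarks whose mean error falls under the 2 mm figure cited in the abstract.
under_2mm = sorted(name for name, err in errors.items() if err < 2.0)
print(errors)
print(under_2mm)
```

With these made-up coordinates, the nasion errors are 5 mm and 1 mm (mean 3 mm) and the menton errors are 1 mm on both images, so only menton falls under the 2 mm threshold.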