Department of Pediatric Dentistry, School of Dentistry, Kyungpook National University, 41940 Daegu, Republic of Korea.
Department of Orthodontics, School of Dentistry, Kyungpook National University, 41940 Daegu, Republic of Korea.
J Clin Pediatr Dent. 2023 Nov;47(6):130-141. doi: 10.22514/jocpd.2023.087. Epub 2023 Nov 3.
At the current level of technology, commercial artificial intelligence (AI) analysis must be accompanied by a human examiner's review to compensate for its insufficient performance. This study aimed to investigate the effects of the human examiner's expertise on the efficacy of AI analysis, including time-saving and error reduction. Eighty-four pretreatment cephalograms were randomly selected for this study. First, human examiners (one beginner and two regular examiners) manually identified 15 cephalometric landmarks and recorded the time required. Subsequently, commercial AI services automatically identified the same landmarks. Finally, the human examiners reviewed the AI landmark determinations and adjusted them as needed, again recording the time required for the review. The elapsed times were then compared statistically. Systematic and random errors among examiners (human examiners, AI, and their combinations) were assessed using Bland-Altman analysis. Intraclass correlation coefficients were used to estimate inter-examiner reliability. No clinically significant time difference was observed regardless of AI use. AI measurement error decreased substantially after review by a human examiner. Among the human examiners, the beginner could obtain better results by reviewing AI output than by manual landmarking. However, the AI review outcomes of the regular examiners were not as good as those of their manual analysis, possibly due to AI-dependent landmark decisions. The reliability of AI analysis could also be improved by adding the human examiner's review. Although the time-saving effect was not evident, commercial AI cephalometric services can currently be recommended for beginners.
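The two agreement statistics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the abstract does not state which ICC form or software was used, so the two-way random-effects, absolute-agreement, single-measure form ICC(2,1) is an assumption here, as are the function names `bland_altman` and `icc_2_1` and the conventional 1.96 multiplier for the limits of agreement.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean difference (systematic error); the limits of
    agreement are bias +/- 1.96 * SD of the differences (random error).
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def icc_2_1(ratings):
    """ICC(2,1) from a table with one row per subject, one column per examiner.

    Assumed form: two-way random effects, absolute agreement, single measure.
    """
    n = len(ratings)       # number of subjects (e.g. cephalograms)
    k = len(ratings[0])    # number of examiners
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Two-way ANOVA sums of squares without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-subject mean square
    msc = ss_cols / (k - 1)               # between-examiner mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


if __name__ == "__main__":
    # Made-up illustrative values, not data from the study
    manual = [10.0, 12.0, 14.0]
    reviewed = [9.0, 12.0, 13.0]
    print(bland_altman(manual, reviewed))
    print(icc_2_1([[1, 2], [2, 3], [3, 4]]))
```

In this form, a constant offset between two examiners lowers the ICC even when their measurements are otherwise perfectly correlated, which is why an absolute-agreement form is a reasonable choice when systematic error matters.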