Imaging Biomarkers and Computer-Aided Diagnosis Laboratory, Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bethesda, MD 20892-1182, USA.
Radiology. 2012 Mar;262(3):824-33. doi: 10.1148/radiol.11110938. Epub 2012 Jan 24.
To assess the diagnostic performance of distributed human intelligence for the classification of polyp candidates identified with computer-aided detection (CAD) for computed tomographic (CT) colonography.
This study was approved by the institutional Office of Human Subjects Research. The requirement for informed consent was waived for this HIPAA-compliant study. CT images from 24 patients, each with at least one polyp of 6 mm or larger, were analyzed by using CAD software to identify 268 polyp candidates. Twenty knowledge workers (KWs) from a crowdsourcing platform labeled each polyp candidate as a true or false polyp. Two trials involving 228 KWs were conducted to assess reproducibility. Performance was assessed by comparing the area under the receiver operating characteristic curve (AUC) of KWs with the AUC of CAD for polyp classification.
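As a rough illustration of the scoring step described above, the sketch below turns the true/false labels from the workers assigned to one candidate into a single score and computes an AUC from those scores. The abstract does not say how the labels were combined, so the vote-fraction rule, the function names, and the toy data are all assumptions made for illustration.

```python
def vote_fraction(votes):
    """Fraction of workers who labeled a candidate as a true polyp.

    `votes` is a list of 0/1 labels (1 = true polyp). Using the vote
    fraction as the score is an assumption, not the study's stated method.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    return sum(votes) / len(votes)


def auc(scores, labels):
    """Empirical AUC via the Mann-Whitney statistic.

    Equals the probability that a randomly chosen true polyp receives a
    higher score than a randomly chosen false positive; ties count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: four candidates, 20 votes each (not study data).
toy_votes = [
    [1] * 18 + [0] * 2,   # true polyp, most workers agree
    [1] * 5 + [0] * 15,   # false positive
    [1] * 12 + [0] * 8,   # true polyp, harder case
    [0] * 20,             # false positive, unanimous
]
toy_truth = [1, 0, 1, 0]
toy_scores = [vote_fraction(v) for v in toy_votes]
toy_auc = auc(toy_scores, toy_truth)
```

The same `auc` function could be applied to a CAD confidence score per candidate, which would give the two quantities being compared.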
The detection-level AUC for KWs was 0.845 ± 0.045 (standard error) in trial 1 and 0.855 ± 0.044 in trial 2. These were not significantly different from the AUC for CAD, which was 0.859 ± 0.043. When polyp candidates were stratified by difficulty, KWs performed better than CAD on easy detections; AUCs were 0.951 ± 0.032 in trial 1, 0.966 ± 0.027 in trial 2, and 0.877 ± 0.048 for CAD (P = .039 for trial 2). KWs who participated in both trials showed a significant improvement in performance going from trial 1 to trial 2; AUCs were 0.759 ± 0.052 in trial 1 and 0.839 ± 0.046 in trial 2 (P = .041).
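To give a feel for how two AUCs with standard errors can be compared, the sketch below applies a simple z-test that treats the two estimates as independent. This is an assumption and a simplification. The workers and CAD scored the same candidates, so the estimates are correlated, and the abstract does not name the test that was used. The result should therefore not be expected to reproduce the reported P values. The function name is hypothetical.

```python
import math


def auc_difference_z_test(auc_a, se_a, auc_b, se_b):
    """Two-sided z-test for a difference between two AUCs.

    Assumes the two AUC estimates are independent (no covariance term),
    which is NOT true for readers scored on the same cases; a paired
    method would be needed there.
    """
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = (auc_a - auc_b) / se_diff
    # Two-sided P value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Values taken from the abstract: trial 1 workers vs CAD.
z_trial1, p_trial1 = auc_difference_z_test(0.845, 0.045, 0.859, 0.043)
print(f"z = {z_trial1:.3f}, P = {p_trial1:.3f}")
```

For trial 1 versus CAD, this approximation gives a small z statistic and a large P value, in line with the reported absence of a significant difference.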
The performance of distributed human intelligence is not significantly different from that of CAD for colonic polyp classification.