Hasan Eeshan, Eichbaum Quentin, Seegmiller Adam C, Stratton Charles, Trueblood Jennifer S
Department of Psychology, Vanderbilt University.
Department of Pathology, Microbiology, and Immunology, Vanderbilt University Medical Center.
Top Cogn Sci. 2022 Apr;14(2):400-413. doi: 10.1111/tops.12588. Epub 2021 Dec 5.
Improving the accuracy of medical image interpretation can improve the diagnosis of numerous diseases. We compared different approaches to aggregating repeated decisions about medical images to improve the accuracy of a single decision maker. We tested our algorithms on data from both novices (undergraduates) and experts (medical professionals). Participants viewed images of white blood cells and made decisions about whether the cells were cancerous or not. Each image was shown twice to the participants and their corresponding confidence judgments were collected. The maximum confidence slating (MCS) algorithm leverages metacognitive abilities to consider the more confident response in the pair of responses as the more accurate "final response" (Koriat, 2012), and it has previously been shown to improve accuracy on our task for both novices and experts (Hasan et al., 2021). We compared MCS to similarity-based aggregation (SBA) algorithms where the responses made by the same participant on similar images are pooled together to generate the "final response." We determined similarity by using two different neural networks where one of the networks had been trained on white blood cells and the other had not. We show that SBA improves performance for novices even when the neural network had no specific training on white blood cell images. Using an informative representation (i.e., network trained on white blood cells) allowed one to aggregate over more neighbors and further boosted the performance of novices. However, SBA failed to improve the performance for experts even with the informative representation. This difference in efficacy of the SBA suggests different decision mechanisms for novices and experts.
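The two aggregation schemes described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `mcs` and `sba`, the use of Euclidean distance between network embeddings, the majority-vote pooling rule, the inclusion of an image's own responses in its pool, and both tie-breaking rules are assumptions made here for illustration, since the abstract does not specify them.

```python
import math


def mcs(first, second):
    """Maximum confidence slating for one image shown twice.

    Each argument is a (response, confidence) pair. Returns the response
    given with the higher confidence. Breaking ties in favor of the
    first presentation is an assumption of this sketch.
    """
    return second[0] if second[1] > first[1] else first[0]


def sba(responses, embeddings, k):
    """Similarity-based aggregation for a single participant.

    responses:  responses[i] is the list of binary responses
                (1 = cancerous, 0 = not) the participant gave to image i.
    embeddings: one feature vector per image, hypothetically taken from
                a neural network layer.
    k:          number of neighboring images pooled with each image.

    Returns one final response per image: the majority vote over the
    image's own responses and those given to its k nearest images.
    """
    finals = []
    for i, target in enumerate(embeddings):
        # Rank all other images by Euclidean distance to the target image.
        others = sorted(
            (j for j in range(len(embeddings)) if j != i),
            key=lambda j: math.dist(target, embeddings[j]),
        )
        pool = list(responses[i])
        for j in others[:k]:
            pool.extend(responses[j])
        # Majority vote; exact ties go to "cancerous" (assumption).
        finals.append(1 if 2 * sum(pool) >= len(pool) else 0)
    return finals
```

In this sketch, a larger `k` pools more of the participant's own responses, which can only help if the embedding places images of the same class close together. That is consistent with the abstract's observation that the network trained on white blood cells allowed aggregation over more neighbors.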