Zhang Jing, Silber James I, Mazurowski Maciej A
Department of Radiology, Duke University School of Medicine, Durham, NC, United States; Computer Science Department, Lamar University, Beaumont, TX, United States.
Department of Biomedical Engineering, Duke University Pratt School of Engineering, Durham, NC, United States.
J Biomed Inform. 2015 Apr;54:50-7. doi: 10.1016/j.jbi.2015.01.007. Epub 2015 Jan 30.
While mammography notably contributes to earlier detection of breast cancer, it has limitations, including a large number of false positive exams. Improved radiology education could help alleviate this issue. Toward this goal, we propose an algorithm that models the false positive errors made by radiology trainees. Identifying the locations that are troublesome for a given trainee could focus their training and, in turn, improve their performance.
The proposed algorithm predicts, for each trainee, the locations that are likely to result in a false positive error, based on that trainee's previous annotations. It consists of three steps. First, suspicious candidate false positive locations are identified in mammograms using a Difference of Gaussians filter, and the corresponding suspicious regions are segmented using computer vision-based segmentation algorithms. Second, 133 features are extracted from each suspicious region to describe its distinctive characteristics. Third, a random forest classifier, trained on the trainee's previous annotations, uses the extracted features to predict the likelihood that the trainee will make a false positive error at that location. We evaluated the algorithm using data from a reader study in which 3 experts and 10 trainees interpreted 100 mammographic cases.
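The three steps can be illustrated with a minimal Python sketch. This is not the authors' implementation: the filter scales, the 9x9 peak window, the four toy patch statistics (standing in for the 133 features), the omission of the segmentation step, and all function names are assumptions made for illustration. It relies on `scipy.ndimage.gaussian_filter` and scikit-learn's `RandomForestClassifier`.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter
from sklearn.ensemble import RandomForestClassifier


def dog_candidates(image, sigma_small=2.0, sigma_large=4.0, max_candidates=10):
    """Step 1 (sketch): return (row, col) peaks of a Difference of Gaussians response.

    The sigma values and the 9x9 peak window are assumed, not taken from the paper.
    """
    image = np.asarray(image, dtype=float)
    dog = gaussian_filter(image, sigma_small) - gaussian_filter(image, sigma_large)
    # Keep pixels that are positive and the maximum of their 9x9 neighbourhood.
    is_peak = (dog == maximum_filter(dog, size=9)) & (dog > 0)
    rows, cols = np.nonzero(is_peak)
    # Strongest responses first.
    order = np.argsort(dog[rows, cols])[::-1][:max_candidates]
    return [(int(rows[i]), int(cols[i])) for i in order]


def patch_features(image, center, half=8):
    """Step 2 (toy stand-in): four intensity statistics of a square patch."""
    r, c = center
    patch = image[max(r - half, 0): r + half + 1, max(c - half, 0): c + half + 1]
    return np.array([patch.mean(), patch.std(), patch.max(), patch.min()])


def train_trainee_model(features, made_false_positive, seed=0):
    """Step 3a: fit a random forest on one trainee's previously annotated regions.

    The labels must contain both classes (error made / no error made).
    """
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(features, made_false_positive)
    return model


def rank_locations(model, image, candidates):
    """Step 3b: sort candidate locations by predicted false positive likelihood."""
    feats = np.array([patch_features(image, c) for c in candidates])
    scores = model.predict_proba(feats)[:, 1]  # probability of the "error" class
    order = np.argsort(scores)[::-1]
    return [(candidates[i], float(scores[i])) for i in order]
```

In this sketch one model is fitted per trainee, which mirrors the abstract's statement that the classifier is trained on that trainee's own previous annotations.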
The algorithm was able to identify locations where the trainee will commit a false positive error with accuracy higher than an algorithm that selects such locations randomly. Specifically, our algorithm found false positive locations with 40% accuracy when only 1 location was selected for all cases for each trainee and 12% accuracy when 10 locations were selected. The accuracies for randomly identified locations were both 0% for these two scenarios.
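One way to read these figures is as the fraction of selected locations, pooled over trainees, at which the trainee actually made a false positive error. The sketch below computes that quantity under this assumed reading; the function name and the example data are hypothetical and are not taken from the study.

```python
def pooled_top_k_accuracy(per_trainee, k):
    """Fraction of the k highest-scored locations per trainee that were true errors.

    per_trainee: one list per trainee of (score, made_error) pairs, where
    made_error says whether the trainee actually marked that location.
    Pooling across trainees is an assumption about how accuracy was defined.
    """
    hits = 0
    total = 0
    for scored in per_trainee:
        top = sorted(scored, key=lambda item: item[0], reverse=True)[:k]
        hits += sum(1 for _, made_error in top if made_error)
        total += len(top)
    return hits / total if total else 0.0


# Hypothetical scores for two trainees (illustrative values only).
example = [
    [(0.9, True), (0.4, False), (0.2, False)],
    [(0.8, False), (0.7, True), (0.1, False)],
]
top1 = pooled_top_k_accuracy(example, k=1)  # one of two top picks is an error
top2 = pooled_top_k_accuracy(example, k=2)  # two of four selected are errors
```

Under this reading, 40% accuracy with 1 location per trainee would correspond to 4 correct selections among the 10 trainees.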
In this first study on the topic, we built computer models that can find locations at which a trainee will make a false positive error in images that the trainee has not previously seen. Presenting trainees with such locations, rather than randomly selected ones, may improve their educational outcomes.