Dominguez Miguel, Wolf Julie Ryan, Prasad Paritosh, Enbiale Wendemagegn, Gottlieb Michael, Berdahl Carl T, Papier Art
VisualDx.
University of Rochester Medical Center.
AMIA Annu Symp Proc. 2025 May 22;2024:368-377. eCollection 2024.
Collecting images of rare dermatological diseases for machine learning detection applications is costly and laborious. It is difficult to collect enough images of these diagnoses to avoid the risk of low accuracy "in the wild". One source of bias in these networks is irrelevant background pixel data. These pixels have no clinical significance, yet deep neural networks will learn weak correlations from them. To limit this, we introduce a masking augmentation algorithm, InfoMax-Cutout, which employs unsupervised Information Maximization losses to mask out background pixels. InfoMax-Cutout increased accuracy on classifying 319 diagnoses by 0.76%. The learned features generalized to an unseen diagnosis task (Fitzpatrick 17k), improving accuracy over a baseline by 43.3% and reducing Gini inequality by 20.9%. This approach of learning to separate out background pixels can increase accuracy in detecting diseases in low- and middle-income countries.
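The two ideas in the abstract that lend themselves to a concrete illustration are masking out background pixels and measuring inequality with a Gini coefficient. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a per-pixel foreground probability map is already available (the abstract does not describe how InfoMax-Cutout produces its mask beyond the use of unsupervised Information Maximization losses), and it assumes the Gini coefficient is computed over a set of non-negative scores such as per-group accuracies. The function names `mask_background` and `gini`, the `threshold` and `fill_value` parameters, and the example numbers are all hypothetical.

```python
import numpy as np


def mask_background(image, foreground_prob, threshold=0.5, fill_value=0.0):
    """Replace pixels judged to be background with a constant value.

    image:           (H, W, C) array of pixel values.
    foreground_prob: (H, W) array in [0, 1]; assumed to come from some
                     separately trained mask model (hypothetical input).
    """
    image = np.asarray(image, dtype=np.float64)
    foreground_prob = np.asarray(foreground_prob, dtype=np.float64)
    if image.shape[:2] != foreground_prob.shape:
        raise ValueError("mask and image must share height and width")
    keep = foreground_prob >= threshold  # boolean (H, W) foreground mask
    # Broadcast the mask over the channel axis; background gets fill_value.
    return np.where(keep[..., None], image, fill_value)


def gini(values):
    """Gini coefficient of non-negative values (0 means perfectly equal)."""
    x = np.sort(np.asarray(values, dtype=np.float64))
    n = x.size
    total = x.sum()
    if n == 0 or total == 0:
        return 0.0
    ranks = np.arange(1, n + 1)
    # Standard rank-based formula on the sorted values.
    return float(2.0 * np.sum(ranks * x) / (n * total) - (n + 1) / n)


if __name__ == "__main__":
    # Made-up 2x2 RGB image and mask, purely for illustration.
    img = np.ones((2, 2, 3))
    prob = np.array([[0.9, 0.1], [0.2, 0.8]])
    print(mask_background(img, prob)[..., 0])
    # Made-up per-group accuracies, purely for illustration.
    print(gini([0.40, 0.55, 0.70, 0.85]))
```

A lower Gini value on per-group scores would indicate that performance is spread more evenly across groups, which is one plausible reading of the inequality reduction reported above.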