Jeon Minkyu, Park Hyeonjin, Kim Hyunwoo J, Morley Michael, Cho Hyunghoon
Broad Institute of MIT and Harvard, Cambridge, MA, USA.
Korea University, Seoul, Republic of Korea.
Comput Vis ECCV. 2022 Oct;13681:661-678. doi: 10.1007/978-3-031-19803-8_39. Epub 2022 Oct 23.
The application of modern machine learning to retinal image analyses offers valuable insights into a broad range of human health conditions beyond ophthalmic diseases. Additionally, data sharing is key to fully realizing the potential of machine learning models by providing a rich and diverse collection of training data. However, the personally identifying nature of retinal images, encompassing the unique vascular structure of each individual, often prevents this data from being shared openly. While prior works have explored image de-identification strategies based on synthetic averaging of images in other domains (e.g., facial images), existing techniques face difficulty in preserving both privacy and clinical utility in retinal images, as we demonstrate in our work. We therefore introduce k-SALSA, a generative adversarial network (GAN)-based framework for synthesizing retinal fundus images that summarize a given private dataset while satisfying the privacy notion of k-anonymity. k-SALSA brings together state-of-the-art techniques for training and inverting GANs to achieve practical performance on retinal images. Furthermore, k-SALSA leverages a new technique, called local style alignment, to generate a synthetic average that maximizes the retention of fine-grain visual patterns in the source images, thus improving the clinical utility of the generated images. On two benchmark datasets of diabetic retinopathy (EyePACS and APTOS), we demonstrate our improvement upon existing methods with respect to image fidelity, classification performance, and mitigation of membership inference attacks. Our work represents a step toward broader sharing of retinal images for scientific collaboration. Code is available at https://github.com/hcholab/k-salsa.
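The idea of summarizing a private dataset by synthetic averages under k-anonymity can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each image has already been mapped to a latent vector (for example by GAN inversion, which is not shown), and the greedy nearest-neighbor grouping and the function name `k_same_average` are hypothetical choices made only for illustration. The one property it demonstrates is that every released centroid is an average of at least k source items.

```python
import numpy as np


def k_same_average(latents: np.ndarray, k: int) -> np.ndarray:
    """Return one centroid per group, where every group has at least k members.

    Hypothetical helper: `latents` is an (n, d) array of per-image latent codes.
    The greedy grouping below is an illustrative assumption, not the paper's method.
    """
    n = len(latents)
    if k < 1 or n < k:
        raise ValueError("need k >= 1 and at least k samples")

    remaining = list(range(n))
    centroids = []
    while remaining:
        if len(remaining) < 2 * k:
            # Too few left to form two valid groups: merge them all into one,
            # so that no group ends up with fewer than k members.
            members = remaining
            remaining = []
        else:
            # Take the first remaining item as a seed and group it with its
            # k - 1 nearest remaining neighbors (the seed itself has distance 0).
            seed = remaining[0]
            dists = np.linalg.norm(latents[remaining] - latents[seed], axis=1)
            nearest = np.argsort(dists)[:k]
            members = [remaining[i] for i in nearest]
            chosen = set(members)
            remaining = [i for i in remaining if i not in chosen]
        # The synthetic "average" for this group is the mean latent code.
        centroids.append(latents[members].mean(axis=0))
    return np.stack(centroids)
```

In a full pipeline each centroid would then be decoded back into an image by the generator; that step, and the local style alignment described above, are outside the scope of this sketch.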