DuMont Schütte August, Hetzel Jürgen, Gatidis Sergios, Hepp Tobias, Dietz Benedikt, Bauer Stefan, Schwab Patrick
ETH Zurich, Zurich, Switzerland.
Max Planck Institute for Intelligent Systems, Tübingen, Germany.
NPJ Digit Med. 2021 Sep 24;4(1):141. doi: 10.1038/s41746-021-00507-3.
Privacy concerns around sharing personally identifiable information are a major barrier to data sharing in medical research. In many cases, researchers have no interest in a particular individual's information but rather aim to derive insights at the level of cohorts. Here, we utilise generative adversarial networks (GANs) to create medical imaging datasets consisting entirely of synthetic patient data. The synthetic images ideally have, in aggregate, similar statistical properties to those of a source dataset but do not contain sensitive personal information. We assess the quality of synthetic data generated by two GAN models for chest radiographs with 14 radiology findings and brain computed tomography (CT) scans with six types of intracranial haemorrhages. We measure synthetic image quality by the performance difference between predictive models trained on either the synthetic or the real dataset. We find that synthetic data performance disproportionately benefits from a reduced number of classes. Our benchmark also indicates that at low numbers of samples per class, label overfitting effects start to dominate GAN training. We conducted a reader study in which trained radiologists discriminated between synthetic and real images. In accordance with our benchmark results, the classification accuracy of radiologists improves with increasing resolution. Our study offers valuable guidelines and outlines practical conditions under which insights derived from synthetic images are similar to those that would have been derived from real data. Our results indicate that synthetic data sharing may be an attractive alternative to sharing real patient-level data in the right setting.
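The quality measure described in the abstract, a performance difference between a model trained on real data and one trained on synthetic data, can be illustrated with a minimal sketch. The abstract does not name the performance metric or the evaluation set, so the choice of AUROC, the shared held-out test set, the function names `auroc` and `performance_gap`, and the toy predictions below are all assumptions made for illustration, not the study's actual protocol or data.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney U) formulation."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # Count positive/negative pairs that are ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def performance_gap(
    labels: Sequence[int],
    scores_real_model: Sequence[float],
    scores_synth_model: Sequence[float],
) -> float:
    """AUROC of the real-trained model minus AUROC of the synthetic-trained model."""
    return auroc(labels, scores_real_model) - auroc(labels, scores_synth_model)


# Hypothetical toy predictions for one finding on a shared test set (not study data).
y_true = [1, 1, 1, 0, 0, 0]
p_real = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]   # assumed model trained on real images
p_synth = [0.9, 0.4, 0.7, 0.5, 0.2, 0.1]  # assumed model trained on synthetic images

gap = performance_gap(y_true, p_real, p_synth)
print(f"AUROC gap (real - synthetic): {gap:.3f}")
```

A gap close to zero would suggest that, for this task, a model trained on the synthetic images performs about as well as one trained on the real images. With several findings per image, the same gap could be computed per class and then averaged.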