Hsu Te-Cheng, Lin Che
Annu Int Conf IEEE Eng Med Biol Soc. 2020 Jul;2020:5669-5672. doi: 10.1109/EMBC44109.2020.9175736.
Accurate prognosis stratification of cancer patients is essential for oncologists to recommend proper treatment plans. Deep learning models can provide strong predictive power for such stratification. The main challenge is that only a limited number of labeled patients are available for cancer prognosis. To overcome this, we proposed Wasserstein Generative Adversarial Network-based Deep Adversarial Data Augmentation (wDADA), which leverages generative adversarial networks to perform data augmentation and assist in model training. We used the proposed framework to train our model for predicting disease-specific survival (DSS) of breast cancer patients from the METABRIC dataset. In predicting 5-year DSS, wDADA achieved an accuracy of 0.6726 ± 0.0278, an AUC of 0.7538 ± 0.0328, and a concordance index of 0.6507 ± 0.0248. These results are comparable to those of our previously proposed Bimodal model (accuracy: 0.6889 ± 0.0159; AUC: 0.7546 ± 0.0183; concordance index: 0.6542 ± 0.0120), which requires careful calibration and an extensive search over pre-trained network architectures. The flexibility of wDADA allows it to be combined with ensemble learning and semi-supervised learning to further improve performance. Our results indicate that generative adversarial networks can be used to train deep models in medical applications where only limited data are available.
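For readers unfamiliar with the concordance index reported alongside accuracy and AUC, the following is a minimal sketch of how such a score can be computed. It assumes Harrell's definition and, as a simplification, skips pairs with tied survival times. The function name `concordance_index`, its arguments, and the toy data are illustrative assumptions and are not taken from the paper; the evaluation code actually used for wDADA may differ.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    times  : observed follow-up time per patient
    events : 1 if the event (e.g. disease-specific death) was observed, 0 if censored
    risks  : predicted risk score per patient (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time ended in an observed event.
        # Tied times are skipped here as a simplifying assumption.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # the model ranked the earlier event as higher risk
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half-concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Hypothetical toy data: four patients, the third one censored.
    times = [2, 4, 6, 8]
    events = [1, 1, 0, 1]
    risks = [0.9, 0.7, 0.8, 0.1]
    print(concordance_index(times, events, risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported values of roughly 0.65.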