McClure Patrick, Kaczmarzyk Jakub R, Ghosh Satrajit S, Bandettini Peter, Zheng Charles Y, Lee John A, Nielson Dylan, Pereira Francisco
National Institute of Mental Health.
Massachusetts Institute of Technology.
Adv Neural Inf Process Syst. 2018 Dec;31:4093-4103.
Collecting the large datasets needed to train deep neural networks can be very difficult, particularly for the many applications for which sharing and pooling data is complicated by practical, ethical, or legal concerns. However, it may be the case that derivative datasets or predictive models developed within individual sites can be shared and combined with fewer restrictions. Training on distributed data and combining the resulting networks is often framed as continual learning, but these methods require networks to be trained sequentially. In this paper, we introduce distributed weight consolidation (DWC), a continual learning method to consolidate the weights of separate neural networks, each trained on an independent dataset. We evaluated DWC with a brain segmentation case study, where we consolidated dilated convolutional neural networks trained on independent structural magnetic resonance imaging (sMRI) datasets from different sites. We found that, compared to an ensemble baseline, DWC led to increased performance on test sets from the different sites, while maintaining generalization performance on a very large and completely independent multi-site dataset.
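The abstract does not spell out the consolidation mechanics, so the following is only a minimal illustrative sketch of one plausible way to merge the weights of independently trained Bayesian networks: precision-weighted combination of per-site diagonal-Gaussian weight posteriors, with the shared prior divided out so it is not counted more than once. The function name `consolidate_gaussian_posteriors`, the default standard-normal prior, and the toy numbers are assumptions for illustration, not the paper's exact DWC update.

```python
import numpy as np

def consolidate_gaussian_posteriors(means, variances, prior_mean=0.0, prior_var=1.0):
    """Combine S per-site diagonal-Gaussian weight posteriors into one.

    Illustrative product-of-Gaussians rule: multiply the S site posteriors
    and divide by the shared prior (S - 1) times. All arrays share the
    shape of the underlying weight tensor.
    """
    means = [np.asarray(m, dtype=float) for m in means]
    variances = [np.asarray(v, dtype=float) for v in variances]
    S = len(means)

    # Work in natural parameters: precisions and precision-weighted means.
    precisions = [1.0 / v for v in variances]
    prior_precision = 1.0 / prior_var

    consolidated_precision = sum(precisions) - (S - 1) * prior_precision
    consolidated_mean = (
        sum(p * m for p, m in zip(precisions, means))
        - (S - 1) * prior_precision * prior_mean
    ) / consolidated_precision

    return consolidated_mean, 1.0 / consolidated_precision


# Toy usage: three "sites", each with a posterior over the same two weights.
site_means = [[0.9, -0.40], [1.1, -0.50], [1.0, -0.45]]
site_vars = [[0.04, 0.09], [0.05, 0.08], [0.03, 0.10]]
mean, var = consolidate_gaussian_posteriors(site_means, site_vars)
print(mean, var)
```

In this sketch, each site contributes in proportion to the precision (confidence) of its posterior over a given weight, which is one standard way to combine independently estimated Gaussian factors; the consolidated distribution could then serve as the prior or initialization for further training.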