Kerley Cailey I, Cai Leon Y, Tang Yucheng, Beason-Held Lori L, Resnick Susan M, Cutting Laurie E, Landman Bennett A
Department of Electrical and Computer Engineering, Vanderbilt University, Nashville, TN, USA.
Department of Biomedical Engineering, Vanderbilt University, Nashville, TN, USA.
Proc SPIE Int Soc Opt Eng. 2023 Feb;12464. doi: 10.1117/12.2653643. Epub 2023 Apr 3.
Batch size is a key hyperparameter in training deep learning models. Conventional wisdom suggests larger batches produce improved model performance. Here we present evidence to the contrary, particularly when using autoencoders to derive meaningful latent spaces from data with spatially global similarities and local differences, such as electronic health records (EHR) and medical imaging. We investigate batch size effects in both EHR data from the Baltimore Longitudinal Study of Aging and medical imaging data from the multimodal brain tumor segmentation (BraTS) challenge. We train fully connected and convolutional autoencoders to compress the EHR and imaging input spaces, respectively, into 32-dimensional latent spaces via reconstruction losses for various batch sizes between 1 and 100. Under the same hyperparameter configurations, smaller batches improve loss performance for both datasets. Additionally, latent spaces derived by autoencoders with smaller batches capture more biologically meaningful information. Qualitatively, we visualize 2-dimensional projections of the latent spaces and find that with smaller batches the EHR network better separates the sex of the individuals, and the imaging network better captures the right-left laterality of tumors. Quantitatively, the analogous sex classification and laterality regressions using the latent spaces demonstrate statistically significant improvements in performance at smaller batch sizes. Finally, visualizations of representative data reconstructions show that lower batch sizes better preserve local individual variation. Taken together, these results suggest that smaller batch sizes should be considered when designing autoencoders to extract meaningful latent spaces among EHR and medical imaging data driven by global similarities and local variation.
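The core experimental manipulation above, training an autoencoder on a reconstruction loss while varying only the batch size, can be illustrated with a minimal sketch. This is not the authors' code: it uses a single-layer linear autoencoder in NumPy (the paper's networks are fully connected and convolutional), and the function name, dimensions, and learning rate are illustrative assumptions. It shows where batch size enters minibatch gradient descent on a mean-squared reconstruction loss.

```python
# Hedged sketch (NOT the paper's implementation): a linear autoencoder
# x -> z -> x_hat trained with minibatch SGD on MSE reconstruction loss.
# `batch_size` is the hyperparameter the paper varies between 1 and 100.
import numpy as np

def train_autoencoder(X, latent_dim=32, batch_size=8, epochs=20, lr=0.01, seed=0):
    """Train encoder/decoder weight matrices; return them with the final MSE."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_enc = rng.normal(scale=0.1, size=(d, latent_dim))   # encoder weights
    W_dec = rng.normal(scale=0.1, size=(latent_dim, d))   # decoder weights
    for _ in range(epochs):
        idx = rng.permutation(n)                          # reshuffle each epoch
        for start in range(0, n, batch_size):
            batch = X[idx[start:start + batch_size]]
            z = batch @ W_enc                             # encode to latent space
            x_hat = z @ W_dec                             # decode / reconstruct
            err = x_hat - batch                           # gradient of 0.5 * MSE
            grad_dec = z.T @ err / len(batch)
            grad_enc = batch.T @ (err @ W_dec.T) / len(batch)
            W_dec -= lr * grad_dec
            W_enc -= lr * grad_enc
    recon = (X @ W_enc) @ W_dec
    return W_enc, W_dec, float(np.mean((recon - X) ** 2))
```

A study like the one described would then sweep `batch_size` (e.g., 1, 10, 100) with all other hyperparameters fixed and compare the resulting reconstruction losses and latent spaces; whether smaller batches win in this toy linear setting is not guaranteed, which is precisely why the paper's empirical comparison on EHR and imaging data is informative.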