Xu Chuhan, Coen-Pirani Pablo, Jiang Xia
Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15217, USA.
Cancers (Basel). 2023 Mar 25;15(7):1969. doi: 10.3390/cancers15071969.
Overfitting weakens generalization and may therefore reduce the accuracy of predictions on future data. In this research, we used an electronic health records (EHR) dataset concerning breast cancer metastasis to study overfitting in deep feedforward neural network (FNN) prediction models. We studied how each hyperparameter, and some of the more interesting pairs of hyperparameters, interact to influence model performance and overfitting. The 11 hyperparameters we studied were activation function, weight initializer, number of hidden layers, learning rate, momentum, decay, dropout rate, batch size, epochs, L1, and L2. Our results show that most of the individual hyperparameters are either negatively or positively correlated with model prediction performance and overfitting. In particular, we found that overfitting overall tends to correlate negatively with learning rate, decay, batch size, and L2, but positively with momentum, epochs, and L1. According to our results, learning rate, decay, and batch size may have a more significant impact on both overfitting and prediction performance than most of the other hyperparameters, including L1, L2, and dropout rate, which were designed to minimize overfitting. We also found some interesting interacting pairs of hyperparameters, such as learning rate and momentum, learning rate and decay, and batch size and epochs.
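As an illustration of what "correlating overfitting with a hyperparameter" can mean in practice, the following is a minimal sketch and not the authors' code. It assumes that overfitting is measured as the gap between training and held-out scores, which the abstract does not specify, and it uses Pearson correlation as one possible statistic. The function names, the record layout, and every number in `runs` are hypothetical placeholders, not results from the paper.

```python
from math import sqrt


def overfitting_gap(train_score, test_score):
    """Assumed overfitting measure: training score minus held-out score."""
    return train_score - test_score


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Placeholder runs: made-up values for illustration only, NOT data from the study.
runs = [
    {"learning_rate": 0.001, "train_auc": 0.95, "test_auc": 0.70},
    {"learning_rate": 0.01, "train_auc": 0.90, "test_auc": 0.72},
    {"learning_rate": 0.1, "train_auc": 0.80, "test_auc": 0.74},
]

rates = [run["learning_rate"] for run in runs]
gaps = [overfitting_gap(run["train_auc"], run["test_auc"]) for run in runs]

# A negative coefficient means the gap shrinks as the learning rate grows.
r = pearson(rates, gaps)
print(f"correlation between learning rate and overfitting gap: {r:.3f}")
```

With real experiments, `runs` would hold one record per trained model from a hyperparameter search, and the same calculation could be repeated for each of the 11 hyperparameters. A rank-based statistic may be a better fit for hyperparameters sampled on a log scale.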