Hu Lvhui, Cheng Xiaoen, Wen Chuanbiao, Ren Yulan
School of Intelligent Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, China.
Sinology College of Chengdu University of Traditional Chinese Medicine, Chengdu, China.
Front Neurosci. 2023 Jul 13;17:1221970. doi: 10.3389/fnins.2023.1221970. eCollection 2023.
Missing data is a common problem in medical research. Imputation is a widely used technique to alleviate this problem. Unfortunately, the inherent uncertainty of imputation can make the model overfit the observed data distribution, which has a negative impact on the model's generalization performance. R-Drop is a powerful technique for regularizing the training of deep neural networks. However, it fails to differentiate positive and negative samples, which prevents the model from learning robust representations. To handle this problem, we propose a novel negative regularization enhanced R-Drop scheme to boost performance and generalization ability, particularly in the context of missing data. The negative regularization enhanced R-Drop additionally forces the output distributions of positive and negative samples to be inconsistent with each other. In particular, we design a new max-minus negative sampling technique that subtracts the mini-batch from the maximum in-batch values to yield negative samples, providing sufficient diversity for the model. We test the resulting max-minus negative regularized dropout method on three real-world medical prediction datasets, including both missing and complete cases, to show the effectiveness of the proposed method.
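The two ingredients named above can be sketched in plain Python: the max-minus negative sampling (each feature of a negative sample is the in-batch maximum of that feature minus the original value) and the symmetric KL divergence that R-Drop uses as its consistency term. This is a minimal illustration; the function names, the per-feature treatment, and the numeric example are our assumptions, not the authors' implementation.

```python
import math

def max_minus_negatives(batch):
    # Hypothetical sketch of max-minus negative sampling: for each
    # feature, take the in-batch maximum and subtract the original
    # value, yielding one negative sample per input sample.
    n_features = len(batch[0])
    col_max = [max(row[j] for row in batch) for j in range(n_features)]
    return [[col_max[j] - row[j] for j in range(n_features)] for row in batch]

def symmetric_kl(p, q, eps=1e-12):
    # Symmetric KL divergence between two output distributions, the
    # consistency term R-Drop applies to two dropout forward passes;
    # eps guards against log(0).
    kl_pq = sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))
    kl_qp = sum(qi * math.log((qi + eps) / (pi + eps)) for pi, qi in zip(p, q))
    return 0.5 * (kl_pq + kl_qp)

batch = [[0.2, 1.0], [0.8, 0.5], [0.5, 0.0]]
negatives = max_minus_negatives(batch)
# Column maxima are [0.8, 1.0], so the first negative sample is
# [0.8 - 0.2, 1.0 - 1.0] = [0.6, 0.0].
```

In the proposed scheme, the symmetric KL is minimized between the two dropout passes on a positive sample (the standard R-Drop term) while the model is additionally pushed to make the output distribution on a max-minus negative sample inconsistent with that of its positive counterpart.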