Department of Computer Science, King Abdulaziz University, 21589, Jeddah, Saudi Arabia.
Department of Computer Science, Albaha University, 65799, Albaha, Saudi Arabia.
Sci Rep. 2024 Feb 24;14(1):4491. doi: 10.1038/s41598-024-54923-y.
Accurate deep learning (DL) models for predicting type 2 diabetes (T2D) must not only solve the discrimination task but also learn useful feature representations. However, existing DL tools are far from perfect and do not provide an interpretation that explains, and can guide improvement of, performance on the target task. We therefore provide an interpretable approach for our deep transfer learning (DTL) models to overcome these drawbacks, working as follows. We use several pre-trained models, including SEResNet152 and SEResNeXT101. We then transfer knowledge from the pre-trained models by keeping the weights in the convolutional base (i.e., the feature extraction part) while modifying the classification part, trained with the Adam optimizer, to classify healthy controls and T2D based on single-cell gene regulatory network (SCGRN) images. Other DTL models work in a similar manner, but keep only the weights of the bottom layers in the feature extraction part unaltered, while the weights of the subsequent layers are updated through training from scratch. Experimental results on all 224 SCGRN images using five-fold cross-validation show that our model (TFeSEResNeXT101) achieves the highest average balanced accuracy (BAC) of 0.97, significantly outperforming the baseline, which had an average BAC of 0.86. Moreover, the simulation study demonstrated that this superiority is attributable to the distributional conformance of the model weight parameters obtained with the Adam optimizer when coupled with weights from a pre-trained model.
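The evaluation metric above, average balanced accuracy over five-fold cross-validation, can be illustrated with a minimal sketch. It assumes binary labels (1 = T2D, 0 = healthy control), defines BAC as the mean of sensitivity and specificity, and uses a simple stratified split. The helper names (`balanced_accuracy`, `stratified_folds`, `cross_validated_bac`) and the `predict_fold` callback are hypothetical and are not taken from the paper; the toy labels are placeholders, not the study's data.

```python
import random
from statistics import mean


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = T2D, 0 = control)."""
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present to compute balanced accuracy")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return 0.5 * (tp / pos + tn / neg)


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds, spreading each class evenly across folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def cross_validated_bac(labels, predict_fold, k=5, seed=0):
    """Average BAC over k folds.

    predict_fold(train_idx, test_idx) is a hypothetical callback that trains
    a model on train_idx and returns predicted labels for test_idx.
    """
    scores = []
    for test_idx in stratified_folds(labels, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        y_true = [labels[i] for i in test_idx]
        y_pred = predict_fold(train_idx, test_idx)
        scores.append(balanced_accuracy(y_true, y_pred))
    return mean(scores)


# Toy usage with placeholder labels (not the study's data) and an oracle
# "model" that simply returns the true labels of the held-out fold.
toy_labels = [1] * 12 + [0] * 8
oracle = lambda train_idx, test_idx: [toy_labels[i] for i in test_idx]
average_bac = cross_validated_bac(toy_labels, oracle)
```

Unlike plain accuracy, BAC weights both classes equally, so it is not inflated when one class dominates a fold. In practice `predict_fold` would wrap the training and inference of a DTL model on the corresponding SCGRN images.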