School of Communication & Information Engineering, Shanghai University, Shanghai, 200444, China.
School of Life Science, Shanghai University, Shanghai, 200444, China.
Chin J Integr Med. 2024 Feb;30(2):163-170. doi: 10.1007/s11655-022-3541-8. Epub 2022 Nov 14.
To develop a multimodal deep-learning model for classifying Chinese medicine constitution, i.e., balanced and unbalanced constitutions, based on tongue and face images from inspection, pulse waves from palpation, and health information from a total of 540 subjects.
The study data consisted of tongue and face images, pulse waves obtained by palpation, and health information (personal information, life habits, medical history, and current symptoms) from 540 subjects (202 males and 338 females). Convolutional neural networks, recurrent neural networks, and fully connected neural networks were used to extract deep features from each modality. Feature fusion and decision fusion models were then constructed for the multimodal data.
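As a rough illustration of the two fusion strategies named here, the following PyTorch sketch (not the authors' code; the feature dimensions, layer sizes, and class names are assumptions) contrasts feature-level fusion, which concatenates per-modality feature vectors before a shared classifier, with decision-level fusion, which classifies each modality separately and averages the class scores.

```python
# Minimal sketch of feature fusion vs. decision fusion for three modality
# encoders, assuming each encoder already yields a fixed-size feature vector.
# All dimensions are illustrative, not taken from the paper.
import torch
import torch.nn as nn

class FeatureFusion(nn.Module):
    """Concatenate per-modality features, then classify jointly."""
    def __init__(self, dims=(512, 64, 32), n_classes=2):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(sum(dims), 128), nn.ReLU(), nn.Linear(128, n_classes)
        )

    def forward(self, feats):            # feats: list of [B, d_i] tensors
        return self.classifier(torch.cat(feats, dim=1))

class DecisionFusion(nn.Module):
    """Classify each modality separately, then average the class scores."""
    def __init__(self, dims=(512, 64, 32), n_classes=2):
        super().__init__()
        self.heads = nn.ModuleList(nn.Linear(d, n_classes) for d in dims)

    def forward(self, feats):
        logits = torch.stack([h(f) for h, f in zip(self.heads, feats)])
        return logits.mean(dim=0)        # simple score-level averaging

# Toy batch of 4 subjects with random per-modality features
feats = [torch.randn(4, d) for d in (512, 64, 32)]
print(FeatureFusion()(feats).shape, DecisionFusion()(feats).shape)
```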
The optimal models for tongue and face images, pulse waves, and health information were ResNet18, the gated recurrent unit (GRU), and entity embedding, respectively. Feature fusion was superior to decision fusion. The multimodal analysis showed that multimodal data compensated for the loss of information from any single modality, resulting in improved classification performance.
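To make the reported optimal configuration more concrete, here is a minimal sketch, again under assumed dimensions and hyperparameters, that wires a torchvision ResNet18 image encoder, a GRU pulse-wave encoder, and entity embeddings for categorical health-information fields into a single feature-fusion classifier. The class `ConstitutionNet`, its layer sizes, and the toy input shapes are hypothetical, not from the paper.

```python
# Minimal sketch (assumptions, not the paper's implementation) combining the
# three encoder types named in the results under a feature-fusion classifier.
import torch
import torch.nn as nn
from torchvision.models import resnet18

class ConstitutionNet(nn.Module):
    def __init__(self, n_categories=(3, 5, 4), emb_dim=8, n_classes=2):
        super().__init__()
        cnn = resnet18(weights=None)
        cnn.fc = nn.Identity()                      # 512-d image features
        self.image_encoder = cnn
        self.pulse_encoder = nn.GRU(input_size=1, hidden_size=64,
                                    batch_first=True)
        self.embeddings = nn.ModuleList(            # entity embeddings per field
            nn.Embedding(n, emb_dim) for n in n_categories)
        fused = 512 + 64 + emb_dim * len(n_categories)
        self.classifier = nn.Linear(fused, n_classes)

    def forward(self, image, pulse, health):
        f_img = self.image_encoder(image)           # [B, 512]
        _, h = self.pulse_encoder(pulse)            # h: [1, B, 64]
        f_pulse = h.squeeze(0)
        f_health = torch.cat([e(health[:, i])
                              for i, e in enumerate(self.embeddings)], dim=1)
        return self.classifier(torch.cat([f_img, f_pulse, f_health], dim=1))

# Toy batch: 2 subjects, 224x224 RGB image, 500-sample pulse wave, 3 categorical fields
model = ConstitutionNet()
out = model(torch.randn(2, 3, 224, 224),
            torch.randn(2, 500, 1),
            torch.randint(0, 3, (2, 3)))
print(out.shape)                                    # torch.Size([2, 2])
```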
Multimodal data fusion can supplement the information available from a single modality and improve classification performance. Our research underscores the effectiveness of multimodal deep learning for identifying body constitution and supports the modernization and intelligent application of Chinese medicine.