Liu Mingjie, Chen Kuiyou, Ye Qing, Wu Hong
School of Medicine and Health Management, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China.
Computer Science and Technology Department, Donghua University, Shanghai 201620, China.
J Biomed Inform. 2024 Dec;160:104757. doi: 10.1016/j.jbi.2024.104757. Epub 2024 Dec 2.
Post-discharge follow-up is a critical component of post-diagnosis management, but limited healthcare resources impede comprehensive manual follow-up. AI voice robots can place follow-up calls instead; however, patients are less cooperative with such calls and may even hang up once they perceive that the caller is an AI voice robot. To improve the effectiveness of follow-up, alternative measures should be taken when patients perceive AI voice robots, so identifying how patients perceive them is crucial. This study aims to construct a deep learning-based multimodal identity perception model to identify how patients perceive AI voice robots.
Our dataset includes 2030 audio recordings of patient responses and the corresponding texts. The proposed model employs a transfer learning approach, using BERT and TextCNN for text feature extraction, AST and LSTM for audio feature extraction, and self-attention for feature fusion. We evaluate it through comparative experiments and an ablation study.
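The fusion step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each modality has already been reduced to one fixed-length vector (random stand-ins here for the BERT+TextCNN and AST+LSTM outputs), treats the two vectors as a two-token sequence, and applies single-head scaled dot-product self-attention. The names `self_attention_fusion`, `w_q`, `w_k`, `w_v` and the dimensions are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_fusion(text_feat, audio_feat, w_q, w_k, w_v):
    """Fuse two modality vectors with single-head self-attention.

    Assumption: text_feat and audio_feat are same-sized 1-D vectors that
    stand in for the text-branch and audio-branch outputs.
    """
    tokens = np.stack([text_feat, audio_feat], axis=0)   # (2, d)
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v   # each (2, d_k)
    scores = q @ k.T / np.sqrt(k.shape[-1])              # (2, 2)
    weights = softmax(scores, axis=-1)                   # each row sums to 1
    attended = weights @ v                               # (2, d_k)
    fused = attended.reshape(-1)                         # concatenate -> (2 * d_k,)
    return fused, weights

# Hypothetical sizes and random placeholder features, for illustration only.
rng = np.random.default_rng(0)
d, d_k = 8, 4
text_feat = rng.normal(size=d)
audio_feat = rng.normal(size=d)
w_q, w_k, w_v = (rng.normal(size=(d, d_k)) for _ in range(3))

fused, weights = self_attention_fusion(text_feat, audio_feat, w_q, w_k, w_v)
```

In a trained model the projection matrices would be learned and `fused` would feed a classification layer; here they are random, so only the shapes and the attention mechanics are meaningful.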
Our model outperforms existing baselines, achieving a precision of 86.67%, an AUC of 84%, and an accuracy of 94.38%. Additionally, a generalization experiment using response audio recordings and corresponding text data from 144 patients in other departments of the hospital confirmed the model's robustness and effectiveness.
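For readers less familiar with the three reported metrics, the following sketch shows how precision, accuracy, and AUC are computed for a binary classifier. The labels and scores are made-up toy values, not data from the study, and treating label 1 as the positive class with a 0.5 decision threshold is an assumption.

```python
def precision(y_true, y_pred):
    # Fraction of predicted positives that are truly positive.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0

def accuracy(y_true, y_pred):
    # Fraction of all predictions that match the true label.
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def roc_auc(y_true, scores):
    # Probability that a random positive is scored above a random negative
    # (ties count as half), which equals the area under the ROC curve.
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example (hypothetical values).
y_true = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

prec = precision(y_true, y_pred)
acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Precision and accuracy depend on the chosen threshold, whereas AUC is threshold-free, which is one reason the three figures can differ noticeably for the same model.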
Our multimodal identity perception model can identify how patients perceive AI voice robots effectively. Identifying how patients perceive AI not only helps to optimize the follow-up process and improve patient cooperation, but also provides support for the evaluation and optimization of AI voice robots.