Yang Xuliang, Pan Aimin, Raga Rodolfo C
University and Urban Integration Development Research Center, Dongguan City University, 523419, Dongguan, China.
College of Computing and Information Technologies, National University-Manila, Manila, 1008, Philippines.
Sci Rep. 2025 Apr 28;15(1):14850. doi: 10.1038/s41598-025-97662-4.
With the rapid development of artificial intelligence, digital human education platforms have become a research hotspot in education. This paper proposes a method for building a multi-modal digital human education platform based on a Generative Adversarial Network (GAN) and a Vision Transformer (ViT). The platform supports high-quality avatar generation and interactive learning. For the experiments, we construct a large-scale dataset covering 1000 students and 50 teachers to evaluate the proposed method. Compared with existing digital human education platforms, the proposed method significantly improves avatar authenticity, interaction response speed, and learning outcomes. Specifically, the average recognition accuracy of avatars increases by 12%, the interaction response time is shortened by 25%, and students' academic performance improves by 8% on average. These results suggest that a multi-modal digital human education platform based on GAN and ViT has strong application potential and can offer new solutions for future education models.