Cockpit-Llama: Driver Intent Prediction in Intelligent Cockpit via Large Language Model.

Authors

Chen Yi, Li Chengzhe, Yuan Qirui, Li Jinyu, Fan Yuze, Ge Xiaojun, Li Yun, Gao Fei, Zhao Rui

Affiliations

College of Automotive Engineering, Jilin University, Changchun 130025, China.

Graduate School of Information and Science Technology, The University of Tokyo, Tokyo 113-8654, Japan.

Publication information

Sensors (Basel). 2024 Dec 25;25(1):64. doi: 10.3390/s25010064.

DOI:10.3390/s25010064

PMID:39796855

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11722838/

Abstract

The cockpit is evolving from passive, reactive interaction toward proactive, cognitive interaction, making precise predictions of driver intent a key factor in enhancing proactive interaction experiences. This paper introduces Cockpit-Llama, a novel language model specifically designed for predicting driver behavior intent. Cockpit-Llama predicts driver intent based on the relationship between current driver actions, historical interactions, and the states of the driver and cockpit environment, thereby supporting further proactive interaction decisions. To improve the accuracy and rationality of Cockpit-Llama's predictions, we construct a new multi-attribute cockpit dataset that includes extensive historical interactions and multi-attribute states, such as driver emotional states, driving activity scenarios, vehicle motion states, body states and external environment, to support the fine-tuning of Cockpit-Llama. During fine-tuning, we adopt the Low-Rank Adaptation (LoRA) method to efficiently optimize the parameters of the Llama3-8b-Instruct model, significantly reducing training costs. Extensive experiments on the multi-attribute cockpit dataset demonstrate that Cockpit-Llama's prediction performance surpasses other advanced methods, achieving BLEU-4, ROUGE-1, ROUGE-2, and ROUGE-L scores of 71.32, 80.01, 76.89, and 81.42, respectively, with relative improvements of 92.34%, 183.61%, 95.54%, and 201.27% compared to ChatGPT-4. This significantly enhances the reasoning and interpretative capabilities of intelligent cockpits.
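
The abstract credits LoRA with reducing training cost but does not spell out the mechanism. The following is a minimal sketch of the general LoRA idea, not the authors' implementation: a frozen weight matrix W is augmented with a trainable low-rank product B·A, so only r·(d_in + d_out) parameters are trained per adapted matrix instead of d_in·d_out. The helper names (`matvec`, `lora_forward`, `trainable_params`, `full_params`) are hypothetical, and the rank, scaling factor, and 4096-wide projection used in the usage note are illustrative assumptions, not values reported in the abstract.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def lora_forward(w, a, b, x, alpha):
    """Return W x + (alpha / r) * B (A x).

    w: frozen pretrained weights, shape d_out x d_in
    a: trainable down-projection, shape r x d_in
    b: trainable up-projection, shape d_out x r
    alpha: scaling hyperparameter (assumed value, chosen by the user)
    """
    r = len(a)                          # LoRA rank = number of rows in A
    base = matvec(w, x)                 # frozen path, never updated
    update = matvec(b, matvec(a, x))    # low-rank path, the only trained part
    scale = alpha / r
    return [h + scale * u for h, u in zip(base, update)]


def trainable_params(d_out, d_in, r):
    """Parameters trained by LoRA for one adapted matrix: A plus B."""
    return r * (d_in + d_out)


def full_params(d_out, d_in):
    """Parameters trained by full fine-tuning of the same matrix."""
    return d_out * d_in
```

Under these assumptions, a single 4096 x 4096 projection adapted with rank 8 trains `trainable_params(4096, 4096, 8)` = 65,536 parameters instead of `full_params(4096, 4096)` = 16,777,216, which is 1/256 of the full count. Initializing B to zeros, as is conventional for LoRA, makes `lora_forward` return exactly the frozen model's output at the start of fine-tuning.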
