Multi-Task Trajectory Prediction Using a Vehicle-Lane Disentangled Conditional Variational Autoencoder.

Authors

Chen Haoyang, Li Na, Shan Hangguan, Liu Eryun, Xiang Zhiyu

Affiliation

The College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China.

Publication information

Sensors (Basel). 2025 Jul 20;25(14):4505. doi: 10.3390/s25144505.

DOI:10.3390/s25144505
PMID:40732633
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12298317/
Abstract

Trajectory prediction under multimodal information is critical for autonomous driving, necessitating the integration of dynamic vehicle states and static high-definition (HD) maps to model complex agent-scene interactions effectively. However, existing methods often employ static scene encodings and unstructured latent spaces, limiting their ability to capture evolving spatial contexts and produce diverse yet contextually coherent predictions. To tackle these challenges, we propose MS-SLV, a novel generative framework that introduces (1) a time-aware scene encoder that aligns HD map features with vehicle motion to capture evolving scene semantics and (2) a structured latent model that explicitly disentangles agent-specific intent and scene-level constraints. Additionally, we introduce an auxiliary lane prediction task to provide targeted supervision for scene understanding and improve latent variable learning. Our approach jointly predicts future trajectories and lane sequences, enabling more interpretable and scene-consistent forecasts. Extensive evaluations on the nuScenes dataset demonstrate the effectiveness of MS-SLV, achieving a 12.37% reduction in average displacement error and a 7.67% reduction in final displacement error over state-of-the-art methods. Moreover, MS-SLV significantly improves multi-modal prediction, reducing the top-5 Miss Rate (MR5) and top-10 Miss Rate (MR10) by 26% and 33%, respectively, and lowering the Off-Road Rate (ORR) by 3%, as compared with the strongest baseline in our evaluation.
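The abstract reports average displacement error, final displacement error, and top-k Miss Rate without defining them. The sketch below shows one common way such multimodal metrics are computed, and is not taken from the paper: it assumes the best-of-k convention (the error of the closest of k predicted trajectories) and a 2 m miss threshold on the maximum pointwise error, which is a convention used for nuScenes prediction but is not stated in this abstract. All function names are hypothetical, and Off-Road Rate is omitted because it requires map geometry.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Trajectory = Sequence[Point]


def pointwise_errors(pred: Trajectory, truth: Trajectory) -> List[float]:
    """Euclidean distance between prediction and ground truth at each time step."""
    if len(pred) != len(truth) or len(truth) == 0:
        raise ValueError("trajectories must be non-empty and of equal length")
    return [math.dist(p, g) for p, g in zip(pred, truth)]


def min_ade(preds: Sequence[Trajectory], truth: Trajectory) -> float:
    """Average displacement error of the best of the k predicted modes."""
    return min(sum(e) / len(e) for e in (pointwise_errors(p, truth) for p in preds))


def min_fde(preds: Sequence[Trajectory], truth: Trajectory) -> float:
    """Final-step displacement error of the best of the k predicted modes."""
    return min(pointwise_errors(p, truth)[-1] for p in preds)


def is_miss(preds: Sequence[Trajectory], truth: Trajectory,
            threshold_m: float = 2.0) -> bool:
    """True when every mode strays more than threshold_m from the ground truth.

    The 2 m default is an assumed convention, not a value from the abstract.
    """
    return all(max(pointwise_errors(p, truth)) > threshold_m for p in preds)


def miss_rate(samples: Sequence[Tuple[Sequence[Trajectory], Trajectory]],
              threshold_m: float = 2.0) -> float:
    """Fraction of (predictions, ground truth) samples that count as misses."""
    misses = sum(is_miss(preds, truth, threshold_m) for preds, truth in samples)
    return misses / len(samples)


def relative_reduction(baseline: float, ours: float) -> float:
    """Percentage by which `ours` improves on `baseline` (lower is better)."""
    return (baseline - ours) / baseline * 100.0


if __name__ == "__main__":
    truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    mode_a = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]  # constant 1 m lateral offset
    mode_b = [(0.0, 0.0), (1.0, 0.0), (2.0, 3.0)]  # exact until the final step
    print(min_ade([mode_a, mode_b], truth))
    print(min_fde([mode_a, mode_b], truth))
    print(is_miss([mode_b], truth))
```

Under these assumptions, a figure such as the reported 12.37% reduction would correspond to `relative_reduction(baseline_ade, model_ade)`, where both arguments are metric values averaged over the evaluation set.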


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7436/12298317/7620b7b30d64/sensors-25-04505-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7436/12298317/3fa77028b2b0/sensors-25-04505-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7436/12298317/c2163753c1b3/sensors-25-04505-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7436/12298317/eee5c896ee7f/sensors-25-04505-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7436/12298317/4d05eec2c9a8/sensors-25-04505-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7436/12298317/5c279ffe3dd1/sensors-25-04505-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7436/12298317/2455645a80ae/sensors-25-04505-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7436/12298317/a5c38ba70055/sensors-25-04505-g008.jpg
