Controllable Diverse Sampling for Diffusion Based Motion Behavior Forecasting

Publication: Chapter in book/report/anthology/conference proceedings; conference paper; research; peer-reviewed

Authors

Yiming Xu, Hao Cheng, Monika Sester

Details

Original language: English
Title of host publication: 2024 IEEE Intelligent Vehicles Symposium (IV)
Pages: 2397-2404
Number of pages: 8
ISBN (electronic): 979-8-3503-4881-1
Publication status: Published - 6 Feb 2024

Publication series

Name: IEEE Intelligent Vehicles Symposium, Proceedings
ISSN (print): 1931-0587
ISSN (electronic): 2642-7214

Abstract

In autonomous driving tasks, trajectory prediction in complex traffic environments requires adherence to real-world context conditions and behavior multimodalities. Existing methods predominantly rely on prior assumptions or generative models trained on curated data to learn road agents' stochastic behavior bounded by scene constraints. However, they often face mode averaging issues due to data imbalance and simplistic priors, and could even suffer from mode collapse due to unstable training and single ground truth supervision. These issues lead the existing methods to a loss of predictive diversity and adherence to the scene constraints. To address these challenges, we introduce a novel trajectory generator named Controllable Diffusion Trajectory (CDT), which integrates map information and social interactions into a Transformer-based conditional denoising diffusion model to guide the prediction of future trajectories. To ensure multimodality, we incorporate behavioral tokens to direct the trajectory's modes, such as going straight, turning right or left. Moreover, we incorporate the predicted endpoints as an alternative behavioral token into the CDT model to facilitate the prediction of accurate trajectories. Extensive experiments on the Argoverse 2 benchmark demonstrate that CDT excels in generating diverse and scene-compliant trajectories in complex urban settings.
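
The token-conditioned denoising idea described in the abstract can be illustrated with a minimal sketch. This is not the CDT model: the behavior templates, the noise schedule, and the `toy_denoiser` function are all hypothetical stand-ins chosen for illustration, and the learned Transformer with map and social context is replaced by an analytic oracle. The sketch only shows how a behavior token passed to the denoiser at every reverse step can steer a standard DDPM-style sampler toward one trajectory mode.

```python
import numpy as np

# Hypothetical behavior tokens with toy 2-D trajectory templates (not from the paper).
HORIZON = 6  # number of future waypoints
STEPS = np.linspace(1.0, 6.0, HORIZON)
TEMPLATES = {
    "straight": np.stack([STEPS, np.zeros(HORIZON)], axis=1),
    "left": np.stack([STEPS, 0.2 * STEPS**2], axis=1),
    "right": np.stack([STEPS, -0.2 * STEPS**2], axis=1),
}

# Assumed linear noise schedule for a DDPM-style process.
T = 50
betas = np.linspace(1e-4, 0.05, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)


def toy_denoiser(x_t, t, token):
    """Stand-in for a learned network: returns the noise implied by
    assuming the clean trajectory equals the template of the given token."""
    x0 = TEMPLATES[token]
    return (x_t - np.sqrt(alpha_bars[t]) * x0) / np.sqrt(1.0 - alpha_bars[t])


def sample(token, rng):
    """Ancestral reverse-diffusion sampling conditioned on a behavior token."""
    x = rng.standard_normal((HORIZON, 2))  # start from pure Gaussian noise
    for t in reversed(range(T)):
        eps_hat = toy_denoiser(x, t, token)
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        # No noise is added at the final step.
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x


rng = np.random.default_rng(0)
left = sample("left", rng)
right = sample("right", rng)
```

Because the oracle denoiser always points at a single template, every sample for a given token ends on that template. A trained conditional model would instead yield a distribution of trajectories within the mode selected by the token.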

Cite this

Controllable Diverse Sampling for Diffusion Based Motion Behavior Forecasting. / Xu, Yiming; Cheng, Hao; Sester, Monika.
2024 IEEE Intelligent Vehicles Symposium (IV). 2024. pp. 2397-2404 (IEEE Intelligent Vehicles Symposium, Proceedings).

Xu, Y, Cheng, H & Sester, M 2024, Controllable Diverse Sampling for Diffusion Based Motion Behavior Forecasting. in 2024 IEEE Intelligent Vehicles Symposium (IV). IEEE Intelligent Vehicles Symposium, Proceedings, pp. 2397-2404. https://doi.org/10.48550/arXiv.2402.03981, https://doi.org/10.1109/IV55156.2024.10588486
Xu, Y., Cheng, H., & Sester, M. (2024). Controllable Diverse Sampling for Diffusion Based Motion Behavior Forecasting. In 2024 IEEE Intelligent Vehicles Symposium (IV) (pp. 2397-2404). (IEEE Intelligent Vehicles Symposium, Proceedings). https://doi.org/10.48550/arXiv.2402.03981, https://doi.org/10.1109/IV55156.2024.10588486
Xu Y, Cheng H, Sester M. Controllable Diverse Sampling for Diffusion Based Motion Behavior Forecasting. in 2024 IEEE Intelligent Vehicles Symposium (IV). 2024. pp. 2397-2404. (IEEE Intelligent Vehicles Symposium, Proceedings). doi: 10.48550/arXiv.2402.03981, 10.1109/IV55156.2024.10588486
Xu, Yiming ; Cheng, Hao ; Sester, Monika. / Controllable Diverse Sampling for Diffusion Based Motion Behavior Forecasting. 2024 IEEE Intelligent Vehicles Symposium (IV). 2024. pp. 2397-2404 (IEEE Intelligent Vehicles Symposium, Proceedings).
BibTeX
@inproceedings{fe1c66175b504b3494eec26ea859e19e,
title = "Controllable Diverse Sampling for Diffusion Based Motion Behavior Forecasting",
abstract = "In autonomous driving tasks, trajectory prediction in complex traffic environments requires adherence to real-world context conditions and behavior multimodalities. Existing methods predominantly rely on prior assumptions or generative models trained on curated data to learn road agents' stochastic behavior bounded by scene constraints. However, they often face mode averaging issues due to data imbalance and simplistic priors, and could even suffer from mode collapse due to unstable training and single ground truth supervision. These issues lead the existing methods to a loss of predictive diversity and adherence to the scene constraints. To address these challenges, we introduce a novel trajectory generator named Controllable Diffusion Trajectory (CDT), which integrates map information and social interactions into a Transformer-based conditional denoising diffusion model to guide the prediction of future trajectories. To ensure multimodality, we incorporate behavioral tokens to direct the trajectory's modes, such as going straight, turning right or left. Moreover, we incorporate the predicted endpoints as an alternative behavioral token into the CDT model to facilitate the prediction of accurate trajectories. Extensive experiments on the Argoverse 2 benchmark demonstrate that CDT excels in generating diverse and scene-compliant trajectories in complex urban settings.",
keywords = "cs.CV",
author = "Yiming Xu and Hao Cheng and Monika Sester",
year = "2024",
month = feb,
day = "6",
doi = "10.48550/arXiv.2402.03981",
language = "English",
isbn = "979-8-3503-4882-8",
series = "IEEE Intelligent Vehicles Symposium, Proceedings",
pages = "2397--2404",
booktitle = "2024 IEEE Intelligent Vehicles Symposium (IV)",
}

RIS

TY - GEN
T1 - Controllable Diverse Sampling for Diffusion Based Motion Behavior Forecasting
AU - Xu, Yiming
AU - Cheng, Hao
AU - Sester, Monika
PY - 2024/2/6
Y1 - 2024/2/6
AB - In autonomous driving tasks, trajectory prediction in complex traffic environments requires adherence to real-world context conditions and behavior multimodalities. Existing methods predominantly rely on prior assumptions or generative models trained on curated data to learn road agents' stochastic behavior bounded by scene constraints. However, they often face mode averaging issues due to data imbalance and simplistic priors, and could even suffer from mode collapse due to unstable training and single ground truth supervision. These issues lead the existing methods to a loss of predictive diversity and adherence to the scene constraints. To address these challenges, we introduce a novel trajectory generator named Controllable Diffusion Trajectory (CDT), which integrates map information and social interactions into a Transformer-based conditional denoising diffusion model to guide the prediction of future trajectories. To ensure multimodality, we incorporate behavioral tokens to direct the trajectory's modes, such as going straight, turning right or left. Moreover, we incorporate the predicted endpoints as an alternative behavioral token into the CDT model to facilitate the prediction of accurate trajectories. Extensive experiments on the Argoverse 2 benchmark demonstrate that CDT excels in generating diverse and scene-compliant trajectories in complex urban settings.
KW - cs.CV
UR - http://www.scopus.com/inward/record.url?scp=85199753378&partnerID=8YFLogxK
U2 - 10.48550/arXiv.2402.03981
DO - 10.48550/arXiv.2402.03981
M3 - Conference contribution
SN - 979-8-3503-4882-8
T3 - IEEE Intelligent Vehicles Symposium, Proceedings
SP - 2397
EP - 2404
BT - 2024 IEEE Intelligent Vehicles Symposium (IV)
ER -
